
Natural Computing Series

Series Editors: G. Rozenberg


Th. Bäck A.E. Eiben J.N. Kok H.P. Spaink
Leiden Center for Natural Computing

Advisory Board: S. Amari G. Brassard K.A. De Jong


C.C.A.M. Gielen T. Head L. Kari L. Landweber T. Martinetz
Z. Michalewicz M.C. Mozer E. Oja G. Păun J. Reif H. Rubin
A. Salomaa M. Schoenauer H.-P. Schwefel C. Torras
D. Whitley E. Winfree J.M. Zurada
R. Paton† · H. Bolouri · M. Holcombe
J.H. Parish · R. Tateson (Eds.)

Computation in
Cells and Tissues
Perspectives and Tools of Thought

With 134 Figures

Springer
Editors

Ray Paton†
Department of Computer Science
University of Liverpool
Liverpool L69 7ZF, UK

Hamid Bolouri
Institute for Systems Biology
Seattle, WA 98103, USA
hbolouri@systemsbiology.org

Mike Holcombe
Department of Computer Science
University of Sheffield
Sheffield S1 4DP, UK
m.holcombe@dcs.shef.ac.uk

J. Howard Parish
School of Biochemistry and Molecular Biology
University of Leeds
Leeds LS2 9JT, UK
howard@bmb.ac.uk

Richard Tateson
Future Technologies Group
Intelligent Systems Lab
BTexact Technologies
Ipswich IP5 3RE, UK
richard.tateson@bt.com

Series Editors

G. Rozenberg (Managing Editor)
rozenber@liacs.nl

Th. Bäck, J.N. Kok, H.P. Spaink
Leiden Center for Natural Computing
Leiden University
Niels Bohrweg 1
2333 CA Leiden, The Netherlands

A.E. Eiben
Vrije Universiteit Amsterdam
The Netherlands

Library of Congress Control Number: 2004042949

ACM Computing Classification (1998): F.0, I.1-2, I.6, J.3

ISBN 978-3-642-05569-0 ISBN 978-3-662-06369-9 (eBook)


DOI 10.1007/978-3-662-06369-9

This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data
banks. Duplication of this publication or parts thereof is permitted only under the provisions
of the German Copyright Law of September 9, 1965, in its current version, and permission for
use must always be obtained from Springer-Verlag Berlin Heidelberg GmbH.
Violations are liable for prosecution under the German Copyright Law.

springeronline.com
© Springer-Verlag Berlin Heidelberg 2004
Originally published by Springer-Verlag Berlin Heidelberg New York in 2004
Softcover reprint of the hardcover 1st edition 2004
The use of general descriptive names, registered names, trademarks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
Cover Design: KünkelLopka, Werbeagentur, Heidelberg
Typesetting: by the Authors
Production: LE-TeX Jelonek, Schmidt & Vöckler GbR, Leipzig
Printed on acid-free paper 45/3142/YL – 5 4 3 2 1 0
It is with great sadness that we have to report the sudden death of Dr. Ray Paton,
the principal editor of this volume, just before we went to press. Ray worked tire-
lessly to bring this book to fruition, and it stands as a rich testament to his inspira-
tional leadership and vision in the field.

All of the other editors wish to record our great gratitude to Ray, who was not
only an outstanding scientist but also a great friend and colleague.

We hope that this book will, in some way, be looked upon as a memorial to Ray's
pioneering work in biologically inspired computing and computational biology.
Preface

The field of biologically inspired computation has coexisted with mainstream
computing since the 1930s, and the pioneers in this area include Warren
McCulloch, Walter Pitts, Robert Rosen, Otto Schmitt, Alan Turing, John von
Neumann and Norbert Wiener.
Ideas arising out of studies of biology have permeated algorithmics, automata
theory, artificial intelligence, graphics, information systems and software design.
Within this context, the biomolecular, cellular and tissue levels of biological
organisation have had a considerable inspirational impact on the development of
computational ideas. Such innovations include neural computing, systolic arrays,
genetic and immune algorithms, cellular automata, artificial tissues, DNA
computing and protein memories. With the rapid growth in biological knowledge
there remains a vast source of ideas yet to be tapped. This includes developments
associated with biomolecular, genomic, enzymic, metabolic, signalling and
developmental systems and the various impacts on distributed, adaptive, hybrid
and emergent computation.
This multidisciplinary book brings together a collection of chapters by
biologists, computer scientists, engineers and mathematicians who were drawn
together to examine the ways in which the interdisciplinary displacement of
concepts and ideas could develop new insights into emerging computing
paradigms. Funded by the UK Engineering and Physical Sciences Research
Council (EPSRC), the CytoCom Network formally met on five occasions to
examine and discuss common issues in biology and computing that could be
exploited to develop emerging models of computation. Many issues were raised
concerned with modelling, robustness, emergence, adaptability, evolvability and
networks, and many tools of thinking and ways of addressing problems were
introduced and discussed. This book seeks to highlight many aspects of this
growing area of study and will allow the reader to explore a breadth of ideas.

Ray Paton
Biocomputing and Computational Biology Group
Department of Computer Science
May 2004 The University of Liverpool, UK
Contents

CytoComputational Systems - Perspectives and Tools of Thought ........................ 1


R. C. Paton
1 Plan of the Book ....................................................................................... 2
2 History of the CytoComputational Systems Project ................................. 6

Cells in Telecommunications ................................................................................. 9


R. Tateson
1 Introduction ............................................................................................... 9
2 Telecommunication Problems ................................................................ 11
3 Features of Cells ..................................................................................... 12
3.1 Evolutionary History ....................................................................... 12
3.2 Division History .............................................................................. 12
3.3 'Life History' ................................................................................... 12
3.4 Dynamic, Metabolic ........................................................................ 13
3.5 Autonomous .................................................................................... 13
3.6 Emergent Control ............................................................................ 14
4 Cell-based Solutions for Telecommunications Problems ....................... 14
4.1 Fruitflies and Mobile Phones ........................................................... 15
4.2 Design by Morphogenesis ............................................................... 17
4.3 CellSim ............................................................................................ 20
5 Conclusion .............................................................................................. 25
References ........................................................................................................ 25

Symbiogenesis as a Machine Learning Mechanism ............................................. 27


L. Bull, A. Tomlinson
1 Introduction ............................................................................................. 27
2 Simulated Symbiogenesis ....................................................................... 28
2.1 The NKCS Model ............................................................................ 28
2.2 Genetic Algorithm Simulation ......................................................... 29
2.3 Results ............................................................................................. 31
2.4 Discussion........................................................................................ 33
3 Symbiogenesis in Machine Learning ..................................................... 36
3.1 ZCS: A Simple Learning Classifier System ................................... 36
3.2 Symbiogenesis in a Learning Classifier System ............................. 38
3.3 Woods 1 ........................................................................................... 40
3.4 Symbiont Encapsulation .................................................................. 41
3.5 System Evaluation in Markov and non-Markov Environments ...... 44
4 Conclusion .............................................................................................. 48
References ........................................................................................................ 49

An Overview of Artificial Immune Systems ...................................................... 51


J. Timmis et al.
1 Introduction ............................................................................................. 51

2 The Immune System: Metaphorically Speaking ..................................... 52


3 The Vertebrate Immune System ............................................................. 54
3.1 Primary and Secondary Immune Responses .................................... 55
3.2 B-cells and Antibodies .................................................................... 55
3.3 Immune Memory ............................................................................. 56
3.4 Repertoire and Shape Space ............................................................ 59
3.5 Learning within the Immune Network ............................................ 59
3.6 The Clonal Selection Principle ........................................................ 62
3.7 Self/Non-Self Discrimination .......................................................... 63
3.7.1 Negative Selection ........................................................................ 63
4 From Natural to Artificial Immune Systems .......................................... 64
4.1 Summary ......................................................................................... 65
5 The Immune System Metaphor............................................................... 66
5.1 A Framework for AIS ...................................................................... 66
5.2 Machine Learning ............................................................................ 68
5.3 Robotics ........................................................................................... 76
5.4 Fault Diagnosis and Tolerance ........................................................ 78
5.5 Optimisation .................................................................................... 79
5.6 Scheduling ....................................................................................... 81
5.7 Computer Security ........................................................................... 82
6 Summary ................................................................................................. 83
7 Comments on the Future for AIS ............................................................ 84
References ....................................................................................................... 86

Embryonics and Immunotronics: Biologically Inspired


Computer Science Systems ................................................................................ 93
A. Tyrrell
1 Introduction ............................................................................................ 93
2 An Overview of Embryonics .................................................................. 95
2.1 Multicellular Organization .............................................................. 95
2.2 Cellular Division ............................................................................. 95
2.3 Cellular Differentiation ................................................................... 95
3 The Organism's Features: Multicellular Organization, Cellular
Differentiation, and Cellular Division .................................................... 97
4 Architecture of the Cell .......................................................................... 98
4.1 Memory ........................................................................................... 99
4.2 Address Generator ......................................................................... 100
4.3 Logic Block ................................................................................... 100
4.4 Input/Output Router ....................................................................... 101
4.5 Error Detection and Error Handling .............................................. 102
5 Examples .............................................................................................. 103
6 Immunotronics ...................................................................................... 105
7 Reliability Engineering ......................................................................... 106
8 The Reliable Human Body ................................................................... 107
9 Bio-Inspired Fault Tolerance ................................................................ 107
10 Artificial Immune Systems ................................................................... 108
11 Domain Mapping .................................................................................. 109

12 Choice of Algorithm ............................................................................. 109


13 Architecture of the Hardware Immunisation Suite ............................... 109
14 Embryonics and Immunotronic Architecture ........................................ 112
15 Conclusion ............................................................................................ 112
References ...................................................................................................... 114

Biomedical Applications of Micro and Nano Technologies ........................... 117


C. J. McNeil, K. L. Snowdon
1 Background ........................................................................................... 117
2 Biomedical Applications of Nanotechnology ....................................... 118
3 Developing a Multidisciplinary Base - The NANOMED Network ..... 120
4 Initial Challenges to NANOMED Problems ......................................... 122
5 Concluding Remarks ............................................................................. 123
References ...................................................................................................... 124

Macromolecules, Genomes and Ourselves ...................................................... 125


S. Nagl et al.
1 Preamble ............................................................................................... 125
2 Macromolecules: Properties and Classification .................................... 126
2.1 Architecture, Form and Function ................................................... 126
2.2 Data Resources .............................................................................. 129
2.3 Protein Classification ..................................................................... 129
2.4 Protein Signatures .......................................................................... 130
3 Models and Metaphors .......................................................................... 131
3.1 Proteins as Machines ..................................................................... 131
3.2 Information Processing by Proteins ............................................... 132
4 Modelling of Complex Cellular Systems
for Post-genomic Biomedicine ............................................................. 134
4.1 Introduction: A Systems View of Life ........................................... 134
4.2 Complexity and Post-genomic Biomedicine ................................. 139
4.3 New Models for Biomedicine: Ethical Implications of Model
Choice ............................................................................................ 140
4.4 Models as Metaphoric Constructions .............................................. 142
5 Conclusions ........................................................................................... 144
References ...................................................................................................... 145

Models of Genetic Regulatory Networks ......................................................... 149


H. Bolouri, M. Schilstra
1 What are Genetic Regulatory Networks? ............................................. 149
2 What is a Gene? .................................................................................... 150
3 Regulation of Single Genes .................................................................. 151
4 Differences in Gene Regulation Between Organisms ........................... 151
5 Modeling GRNs .................................................................................... 152
6 Some GRN Models to Date .................................................................. 153
7 GRN Simulators .................................................................................... 155
8 Uses of GRNs Beyond Biology ............................................................ 156
References ...................................................................................................... 156

A Model of Bacterial Adaptability Based on Multiple Scales


of Interaction: COSMIC ..................................................................................... 161
R. Gregory et al.
1 Introduction .......................................................................................... 162
2 Biology .................................................................................................. 163
2.1 DNA, RNA and Proteins .............................................................. 163
2.2 Transcription ................................................................................ 164
2.3 Protein Structure ........................................................................... 165
2.4 Optional Transcription .................................................................. 166
2.5 lac Operon .................................................................................... 166
2.6 trp Operon .................................................................................... 167
2.7 An E. coli Environment ................................................................. 168
3 The Genome and the Proteome ............................................................ 169
4 Model ................................................................................................... 170
5 Implementation ..................................................................................... 173
6 Results .................................................................................................. 173
6.1 Environmental Macroscopic View ............................................... 173
6.2 Cell Lineage .................................................................................. 175
6.3 Gene Expression ........................................................................... 176
6.4 Network Graphs ........................................................................... 178
6.5 Cell Statistics ................................................................................. 178
7 Discussion ............................................................................................ 182
References .................................................................................................... 183

Stochastic Computations in Neurons and Neural Networks ......................... 185


J. Feng
1 Abstract ................................................................................................ 185
2 The Integrate-and-Fire Model and Its Inputs ........................................ 189
3 Theoretical Results ............................................................................... 191
3.1 Behaviour of a.(A)"2c,r) ............................................................... 192
3.2 Input-Output Relationship ........................................................... 195
4 Informax Principle ................................................................................ 197
4.1 The IF Model Redefined .............................................................. 197
5 Learning Rule ....................................................................................... 198
6 Numerical Results ................................................................................ 202
6.1 Supervised Learning ..................................................................... 202
6.2 Unsupervised Learning ................................................................. 203
6.3 Signal Separations ........................................................................ 204
7 Discussion ............................................................................................ 205
References .................................................................................................... 209

Spatial Patterning in Explicitly Cellular Environments:


Activity-Regulated Juxtacrine Signalling ........................................................... 211
N. Monk
1 Introduction .......................................................................................... 211
2 Biological Setting ................................................................................. 212
3 Mathematical Models of Juxtacrine Signalling .................................... 213

4 Pattern Formation ................................................................................. 215


4.1 Lateral Inhibition and Spacing Patterns ......................................... 215
4.2 Gradients and Travelling Fronts .................................................... 218
4.3 More Complex Spatial Patterns ..................................................... 220
5 Further Developments ........................................................................... 223
References ...................................................................................................... 223

Modelling the GH Release System ................................................................... 227


D. J. MacGregor et al.
1 Introduction ........................................................................................... 227
2 Research Background ........................................................................... 227
3 GH Research ......................................................................................... 228
3.1 Experimental Approach ................................................................. 229
3.2 Anatomical Results ........................................................................ 230
3.3 Electrophysiological Results ......................................................... 231
3.4 Behavioural Results ....................................................................... 231
4 Creating the System .............................................................................. 233
4.1 Simplifications ............................................................................... 235
5 Making the Experimental Model .......................................................... 235
5.1 Storage Variables ........................................................................... 236
5.2 Input Protocols ............................................................................... 236
6 The Model ............................................................................................. 237
6.1 The Pituitary Model ....................................................................... 237
6.2 The GH System Model .................................................................. 238
7 Working with the Model ....................................................................... 240
7.1 The Model Parameters ................................................................... 241
7.2 Assessing Performance .................................................................. 241
7.3 Initial Results ................................................................................. 242
7.4 Comparison with Real GH Release ............................................... 242
7.5 A GHRH-Somatostatin Connection ............................................... 244
7.6 GH-Somatostatin Stimulatory Connection .................................... 245
8 Conclusions .......................................................................................... 246
References ...................................................................................................... 248

Hierarchies of Machines ..................................................................................... 251


M. Holcombe
1 Introduction: Computational Models .................................................... 251
2 More Powerful Machines ..................................................................... 255
2.1 X-machines .................................................................................... 255
2.2 Communicating X-machines [6] ................................................... 259
2.3 Hybrid Machines ........................................................................... 260
3 Agents and Agent Systems ................................................................... 262
4 Hierarchies of Machines ...................................................................... 264
4.1 Cellular Hierarchies ...................................................................... 264
4.2 Tissue Hierarchies ......................................................................... 266
5 Conclusions and Further Work ............................................................. 267
References ...................................................................................................... 268

Models of Recombination in Ciliates ............................................................... 269


P. Sant, M. Amos
1 Introduction and Biological Background .............................................. 269
1.1 IESs and MDSs ............................................................................. 270
1.2 Scrambled Genes ........................................................................... 271
1.3 Fundamental Questions ................................................................. 271
2 Models of Gene Construction ............................................................... 272
3 Discussion ............................................................................................. 275
References ..................................................................................................... 276

Developing Algebraic Models of Protein Signalling Agents .......................... 277


M. J. Fisher et al.
1 Proteins as Computational Agents ........................................................ 277
2 Protein Information Processing Networks ............................................ 280
3 Towards an Algebraic Model of Protein Interactions ........................... 282
References ..................................................................................................... 286

Categorical Language and Hierarchical Models for Cell Systems ................ 289
R. Brown et al.
1 Introduction .......................................................................................... 289
2 Category Theory: History and Motivation ........................................... 290
3 Categorical Models for Hierarchical Systems ...................................... 292
4 Conclusion ............................................................................................ 302
References .................................................................................................... 302

Mathematical Systems Biology: Genomic Cybernetics ...................................... 305


O. Wolkenhauer et al.
1 Introduction: Action versus Interactions ............................................... 306
2 Integrating Organisational and Descriptional Levels of Explanation ... 307
3 Scaling and Model Integration .............................................................. 310
4 Theory and Reality: Experimental Data and Mathematical Models .... 312
5 Mathematical Systems Biology: Genomic Cybernetics ........................ 315
6 Dynamic Pathway Modeling as an Example ........................................ 316
7 Summary and Conclusions ................................................................... 323
References ..................................................................................................... 324

What Kinds of Natural Processes Can Be Regarded as Computations? ............ 327


C. G. Johnson
1 Introduction .......................................................................................... 327
2 Computer Science or Computer Science? ............................................. 328
2.1 Complexity of Natural Computation ............................................. 328
2.2 Simulation of Natural Systems ...................................................... 329
3 Grades of Possibility ............................................................................. 330
3.1 Computational Possibility ............................................................. 331
4 Can a Change not be a Computation? ................................................... 332
4.1 Observability ................................................................................. 332
4.2 Consistent Ascribing of Symbols .................................................. 333

4.3 Digital Encoding ............................................................................ 334


4.4 Flexibility of Inputs ....................................................................... 334
4.5 Intention to Initiate a Change ........................................................ 334
4.6 Summary ........................................................................................ 335
References ...................................................................................................... 335

List of Contributors ........................................................................................... 337

Index ................................................................................................................... 341


CytoComputational Systems - Perspectives and
Tools of Thought

R. Paton

Department of Computer Science, University of Liverpool, Chadwick Building,


Peach Street, Liverpool L69 7ZF, United Kingdom

Cells are complex systems. For some people, they are like tiny chemical plants, or
laboratories, or machines. For others they are more like computational devices or
even computational systems. As data continue to be generated about them and
their components and the systems that they make up, new perspectives and models
are needed to deal with the complexity. Cells are more than bags of chemicals just
as macromolecules are more than microscopic billiard balls or strings of beads.
The challenges of an information processing view that complements the more
commonly expressed chemical processing view need to be taken into account.
The biomolecular, cellular and tissue levels of biological organisation have had
considerable inspirational impact on the development of computational models
and systems. Such innovations include neural computing, systolic arrays, genetic
and immune algorithms, cellular automata, artificial tissues, molecular computing,
and protein memories. With the rapid growth in biological knowledge there
remains a vast source of ideas yet to be tapped. These include developments
associated with biomolecular, genomic, enzymic, metabolic and signalling
systems and the various impacts on distributed, adaptive, hybrid and emergent
computation. Many biologists use language that is displaced from computer
science, not least: program (as in apoptosis), hardware-software, DNA as data
or program, code, gate, Boolean network, pattern recognition, and so forth.
Indeed, many proteins (such as enzymes and transcription factors) carry out
complex information processing tasks (as individual molecules), including pattern
recognition, switching, logical decision-making, memory, and signal integration.
This book provides readers with a comprehensive exploration of this subject
from a uniquely multidisciplinary point of view. Contributions from biologists,
computer scientists, engineers and mathematicians are drawn together to provide a
comprehensive picture of biological systems, both as sources for ideas about
computation and as information processing systems. The varieties of perspectives
that are presented provide an integrative view of a complex and evolving field of
knowledge that needs new tools of thought to manage and mobilise the vast
opportunities afforded to the postgenomic and biocomputational sciences.

1 Plan of the Book

The book begins with several chapters that represent a (bio)mimetic approach to
engineering and computation. Tateson, a biologist by training who now works for
BTexact Technologies, looks at the application of cellular analogies to
telecommunication systems. He argues that in some cases there are sound reasons
for believing that analogies will be helpful. We can identify biological systems
which resemble in some sense an artificial system, and then use our knowledge
and understanding of the functioning of the biological system to improve or
redesign the artificial system. In other cases there is little more than the assertion
that 'nature knows best' and hence any artefact modelled on nature must be
superior to an artefact for the same purpose devised by human reason alone. This
is not a useful basis for redesigning our artificial systems, and in fact the analogy
with nature is often 'bolted on' to the human-designed system to explain to non-
engineers how it works rather than being genuinely useful at the design stage.
The chapter by Bull and Tomlinson shows how symbiogenetic mechanisms
found at the cellular level can be successfully applied to computational learning.
Symbiosis is the phenomenon in which organisms of different species live
together in close association, potentially resulting in a raised level of fitness for
one or more of the organisms. Symbiogenesis is the name given to the process by
which symbiotic partners combine and unify - forming endosymbioses and then
potentially transferring genetic material - giving rise to new morphologies and
physiologies evolutionarily more advanced than their constituents. This process is
known to occur at many levels, from intra-cellular to inter-organism. They use the
abstract NKCS model of coevolution to examine endosymbiosis and its effect on
the evolutionary performance of the entities involved. They suggest the conditions
under which endosymbioses are more likely to occur and why; they find that these
emerge between organisms within a window of their respective "chaotic gas
regimes", and hence that the association represents a more stable state for the
partners. This general result is then exploited within a machine learning
architecture to improve its performance in non-Markov problem domains.
Timmis, Knight, Castro and Hart discuss the growing field of Artificial
Immune Systems (AIS) - that is, using the natural immune system as a metaphor
for solving computational problems. The field of AIS is relatively new and draws
upon work done by theoretical immunologists such as Jerne, Perelson, and Varela.
What is of interest to researchers developing AIS is not the modelling of the
immune system, but extracting or gleaning useful mechanisms that can be used as
a metaphor to help solve particular problems. It is quite common to see gross
simplifications of the way the immune system works, but this is not a problem, as
it is inspiration that computer scientists seek from nature rather than precise
mechanisms. The review is organised in the following manner. First, reasons for
why the immune system has generated such interest and is considered to be a good
metaphor to employ are discussed. This is followed by a simple review of relevant
immunology that creates many of the foundations for work reviewed in this
contribution. Immunology is a vast topic and no effort has been made to cover the

whole area. Rather, only those ideas that have proved to be useful to the majority
of research presented in this contribution are explained in some detail. Finally, a
summary of the work presented in this contribution is provided, drawing main
conclusions from the work presented and commenting on the perceived future of
this emerging technology.
Tyrrell considers the analogy between multi-cellular organisms and multi-
processor computers as not too far-fetched, and well worth investigating,
particularly when considering that nature has achieved levels of complexity that
far surpass any man-made computing system. The aspect of biological organisms
on which this chapter is centred is their phenomenal robustness: in the trillions of
cells that make up a human being, faults are rare, and in the majority of cases,
successfully detected and repaired. The Embryonics project (for embryonic
electronics) is inspired by the basic processes of molecular biology and by the
embryonic development of living beings. By adopting certain features of cellular
organisation, and by transposing them to the two-dimensional world of integrated
circuits in silicon, it will be shown that properties unique to the living world, such
as self-replication and self-repair, can also be applied to artificial objects
(integrated circuits). Self-repair allows partial reconstruction in case of a minor
fault, while self-replication allows complete reconstruction of the original device
in cases where a major fault occurs. These two properties are particularly desirable
for complex artificial systems in situations that require improved reliability. To
increase still further the potential reliability of these systems, inspiration has also
been taken from biological immune systems - Immunotronics. The acquired
immune system in humans (and most vertebrates) has a mechanism for error
detection which is simple, effective and adaptable.
The chapter by McNeil and Snowdon serves to provide a further conceptual
bridge between hardware and biology, this time by working with molecules at the
nanoscale. The dawn of nanoscale science can be traced to a now classic talk that
Richard Feynman gave on December 29th, 1959 to the annual meeting of the
American Physical Society at the California Institute of Technology. In this
lecture, Feynman suggested that there exists no fundamental reason to prevent the
controlled manipulation of matter at the scale of individual atoms and molecules.
Thirty-one years later, Eigler and co-workers constructed the first man-made
object atom-by-atom with the aid of a scanning tunnelling microscope. Given that
there is "Plenty of room at the bottom" (the title of Feynman's talk), and biological
systems have highly subtle and sophisticated meso- and micro-scale architectures,
the exploitation of this level in medical and computational technologies will
continue to challenge 21st century biocomputational science.
The kind of approach discussed in the previous chapters has an established
record since the 1940s and the developments in cybernetics, digital electronics,
and general models of computation. Thus we find the developments in digital
models of neurones (the McCulloch-Pitts model), computational models of brains,
and the origins of cellular automata. Many of the ideas that were spawned in the
1940s and 1950s, and many tools of thought for helping scientists and engineers to
organise their knowledge of biological systems can be traced back to this time (as

can the revolution that took place in molecular biology). The viewpoint now
moves along towards ways in which the languages of physical science and
mathematics can enhance our appreciation of biological systems.
A common observation made by scientists who are working at multi-
disciplinary interfaces (such as CytoComputational Systems) relates to the
problems encountered by differences in vocabulary, emphasis, modelling
approach, and attitudes to reduction and simplification. The next chapter looks at
some of the ways in which displacements of ideas between the disciplines can take place. Nagl,
Parish, Paton, and Warner consider ways of describing computational topics in
molecular and cellular biology. Methods of classifying DNA, RNA and proteins
are central to current methods for elucidating relationships between sequence,
structure and function. The chapter looks at metaphors for the function of proteins
and points to a unified view of proteins as computational devices capable of
matching patterns as inputs and processing them to result in alternative outputs. The
requirement for a systems view of life is also pursued. As such this chapter
provides an immediate bridge between the previous few and next few chapters. In
addition it also anticipates a number of themes emerging in later chapters (such as
Fisher et al. and Wolkenhauer et al.).
Following on from this computational stance, Bolouri and Schilstra provide a
short review of the modelling of Genetic Regulatory Networks (GRNs). GRNs
have a basic requirement to model (at least) some parts of a biological system
using some kind of logical formalism. They represent the set of all interactions
among genes and their products for determining the temporal and spatial patterns
of expression of a set of genes. The origins of modelling the regulation of gene
expression go back to the Nobel Prize winning work of Lwoff, Jacob and Monod
on the mechanisms underlying the behaviour of bacterial viruses that switch
between so-called lytic and lysogenic states. The authors briefly discuss some of
the circuit-based approaches to GRNs such as the work of Kauffman, Thomas, and
Shapiro and Adams.
The next two chapters address computational modelling of cells using very
different approaches. The chapter by Gregory, Paton, Saunders and Wu reports on
work concerned with Individual-Based Models (IBMs) of a 'virtual' bacterial cell.
The goal of this project has been to explore the ecological and evolutionary
trajectories of 'artificial bacteria'. This is a great challenge both to understanding
the cell in sufficient detail and to implementing a system on current computational
architectures. Each bacterium is an independent agent with sufficient genomic and
proteomic equivalents built into the model such that each individual cell can have
up to 120 genes and 50,000 gene products. The chapter reports on the
development of a model that has to incorporate multiple scales in both time and
space.
Feng's chapter develops some mathematical models of stochastic computations
in neurones and neural networks. It is a bridge between the cell-based work
discussed previously, the tissue-based work we look at next, and the bioinspired
approaches that started the book. Feng discusses the developments of his neuronal
decision theoretic approach in relation to the role played by inhibitory inputs. The

cellular components execute a learning rule and the networks that are produced
can be applied to statistical pattern recognition problems.
The next two chapters provide very interesting insights into the workings of
mathematical modellers as their work addresses very specific biological problems.
Monk's chapter looks at his work dealing with spatial patterning in explicitly
cellular environments. Pattern formation in multicellular organisms generally
occurs within populations of cells that are in close contact. It is thus natural and
important to consider models of pattern formation that are constructed using a
spatially discrete cellular structure. Here, the particular case of pattern formation
in cellular systems that depends on contact-dependent (juxtacrine) signalling
between cells is discussed.
At another scale of biological organisation, MacGregor, Leng and Brown
describe a model of the hypothalamic and pituitary components involved in
controlling growth hormone release. Their model has been developed by gathering
and attempting to formalise the experimental data about the system but has been
kept as simple as possible, focusing on the functional rather than mechanical
properties of its components. They show that a relatively simple model can be
capable of producing complex behaviour and accurately reproducing the
behaviour and output of a real brain system.
We now address a collection of modelling approaches that build on a
computational perspective (that is, the underlying metaphor is focused on
computation or information processing rather than the dynamics expressed in
models based on continuous mathematics). Holcombe notes how computational
models have been of interest in biology for many years and have represented a
particular approach to trying to understand biological processes and phenomena
from a systems point of view. One of the most natural and accessible
computational models is the state machine. These come in a variety of types and possess
a variety of properties. This chapter discusses some useful ones and looks at how
machines involving simpler machines can be used to build plausible models of
dynamic, reactive and developing biological systems that exhibit hierarchical
structures and behaviours.
Another computational stance that has been related to coding/decoding issues
concerns the sequence structures and scrambling of genes. Sant and Amos
examine some recent work on models of recombination in Ciliates. They describe
how these single-celled organisms 'compute' by unscrambling their genetic
material. They present a number of models of this process and discuss their
implications, which could prove useful for the development of cellular
computers.
In contrast to the previous chapter, which looks at genes and DNA, the chapter
by Fisher, Malcolm and Paton looks at proteins and, specifically, the modelling of
signalling proteins as algebraic agents. Harking back to the previous chapter by
Nagl et al., the authors begin by looking at proteins as computational agents.
Protein information processing networks are discussed, notably secondary
messenger signalling, signalling kinases and phosphatases, scaffold proteins and

protein-protein interactions. The final section of the paper develops an algebraic
model of protein interactions based on rewrite rules.
The final part of the previous chapter points to an area of mathematics called
Category Theory that has variously been applied to problems in theoretical
biology since the 1960s. However, it is inaccessible to many non-mathematicians
and yet is very useful in helping to integrate knowledge. The short piece by
Brown, Paton and Porter seeks to give non-specialists, and especially biologists,
an accessible introduction to the subject, especially in relation to hierarchical
structures.
Wolkenhauer and Kolch present an approach that can be used to investigate
genome expression and regulation through mathematical systems theory. The
principal idea is to treat gene expression and regulatory mechanisms of the cell
cycle, morphological development, cell differentiation and environmental
responses as controlled dynamic systems. Although it is common knowledge that
cellular systems are dynamic and regulated processes, to date they have not been
investigated and represented as such. The kinds of experimental techniques that
have been available in molecular biology largely determined the material
reductionism, which describes gene expression by means of molecular
characterisation. Instead of trying to identify genes as causal agents for some function,
role or change in phenotype, they relate these observations to sequences of events.
The final chapter by Johnson can be viewed as one summary approach to some
of the issues the CytoCom Network had to address, namely the kinds of natural
processes that can be regarded as computations. In recent years the idea of using
computational concepts as a way of understanding biological systems has become
of increasing importance; this conceptual use of computational ideas should be
contrasted with the equally valuable activity of using computers as tools for
interpreting biological data and simulating biological systems. He suggests that
this computational attitude towards biological systems has been valuable in
computer science itself, too; by observing how biological systems solve problems,
new algorithms for problem solving on computers can be developed.

2 History of the CytoComputational Systems Project

The CytoCom project was one of a number of networks of scientists funded by the
UK Engineering and Physical Sciences Research Council looking at Emerging
Computing Paradigms. In our case we were using non-neural cells and tissues as
the area of focused study and discussion. We held five workshops between 1999
and 2001, in Liverpool, Leeds, Hertford, Sheffield and London. This book,
together with a number of other publications, further networks and funded
research, was among the achievements. A less quantifiable achievement was the
general increase in awareness of this level of biological organisation as a source of
computational ideas. CytoCom grew from an existing, though loose, community of
scientists interested in and attending an international series of workshops called
IPCAT (Information Processing in Cells and Tissues). The first IPCAT was held

in Liverpool in 1995 and since then we have had workshops every other year in
Sheffield (1997), Indianapolis (1999) and Leuven (2001). The fifth workshop was
held in Lausanne (2003) and the sixth is planned for York in 2005.

Acknowledgement. CytoCom was made possible through support from the UK
Engineering and Physical Sciences Research Council (EPSRC) and we are
especially grateful to Jim Fleming and Mark Hylton for all their help and
encouragement. We are also very grateful to Helen Forster for all her help and
administrative support.
Cells in Telecommunications

R. Tateson

Future Technologies Group, Intelligent Systems Lab, BTexact Technologies,
PPlI12, Orion Bldg., Adastral Park, Martlesham, Ipswich IP5 3RE, UK
richard.tateson@bt.com

Abstract. There are many examples of the natural world providing inspiration for
human engineers and designers. Cell biology is one branch of the natural sciences
which has not yet been widely exploited in this way, but which has great potential
for application, particularly in the telecommunications area. The features of cells
map strikingly well on to some of the challenges in current engineering and design
for telecommunications systems. The autonomy, evolution, adaptivity and
self-organisation of cells are all desirable for the complex, dynamic and geographically
distributed networks we are now constructing and using. Three examples of
current research illustrate how analogies from cells can lead to radically different
telecommunications systems. Cell fate behaviour in fruitfly cells has inspired a
new, decentralised approach to managing mobile phone networks. Morphogenetic
events in early embryos point to new design methods which add depth to the
established, and also biologically inspired, techniques of evolutionary optimisation.
Genetic control pathways in bacteria inspire implicit learning techniques which
allow individual 'cells' in simulation to discover adaptive behaviour without an
explicit definition of 'fitness'. All of the examples are at the research stage, and will
not be used in real networks until 2004 and beyond. However, they give a glimpse
of the strength of the analogy between biological cells and elements in telecom-
munications systems, and suggest that this will be a productive area of work into
the future.

1 Introduction

This chapter explores the application of cellular analogies to telecommunications.


First it is necessary to give some justification for even discussing this marriage of
the cytological with the informational. To begin at a more general level, there is an
extensive history of looking to nature when attempting to engineer the artificial. A
very readable treatment of this from the mechanical and structural engineering
perspective is given by Steven Vogel in 'Cat's Paws and Catapults'. Some of the
historical successes mentioned include hydrodynamics inspired by dolphin body-
shape; aerodynamics inspired by birds' wings, papermaking inspired by wasps'
nests and Velcro inspired by plant seed 'burs'. There is even an historical prece-
dent for biological inspiration in telecommunications, though not at the cellular
level: Alexander Graham Bell was inspired by his knowledge of the human
eardrum to invent a telephone transmitter and receiver which relied on the vibration
of a 'membrane' to turn sound to an electrical signal and back again.
Biological analogies are most powerful when we identify a biological system
which resembles in some sense an artificial system, and then use our knowledge
and understanding of the functioning of the biological system to improve or
redesign the artificial system. In recent years two areas of biology have proved fertile
for inspirations and analogies in the computation and telecommunications field.
Social insects, particularly ants, have been a rich source of ideas for
telecommunications networks (reviewed in [1]). The most famous example is the analogy
between foraging behaviour of ants and routing of information packets. The ants
leave pheromone trails which can be followed by the other ants seeking food. On
finding food, ants return to the nest, laying a trail as they go. In the case where
there are two (or more) alternative routes back to the nest, the shortest route tends
to accumulate pheromone fastest and becomes the preferred route. This is efficient
and hence desirable for the ants and is achieved without any 'bird's-eye view' or
other global map, and without any reasoning or central control. In a
communications network, packets of information are routed from origin to destination by a
number of 'hops' from node to node in the network. Shorter routes are desirable
again for reasons of efficiency. The routing problem is usually solved by
providing each node in the network with a list of destinations so that it 'knows' where to
send any newly arrived packet. The lists are calculated centrally and disseminated
to the nodes. The ant-inspired alternative is to use the packets themselves to
update the lists. Just as the ants leave pheromone trails, the packets leave their
influence on the lists, and tend to reinforce short routes at the expense of long ones.
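The reinforcement loop just described can be sketched as a simple mean-field simulation. Everything here is illustrative: the route lengths, deposit rule and evaporation rate are invented for the sketch, and real ant-based routing schemes such as those reviewed in [1] are considerably more elaborate.

```python
# Mean-field sketch of pheromone reinforcement on two alternative routes.
# Route lengths, deposit and evaporation rates are invented for illustration.

def reinforce(lengths=(2, 5), steps=100, evaporation=0.05):
    pheromone = [1.0] * len(lengths)              # start with no preference
    for _ in range(steps):
        total = sum(pheromone)
        # ants split across routes in proportion to pheromone; a shorter
        # route is completed more often, so it receives a larger deposit
        deposits = [(p / total) / length
                    for p, length in zip(pheromone, lengths)]
        # evaporation forgets stale trails, deposits reinforce good ones
        pheromone = [p * (1 - evaporation) + d
                     for p, d in zip(pheromone, deposits)]
    return pheromone

levels = reinforce()
assert levels[0] > levels[1]    # the shorter route ends up preferred
```

The key design point is the coupling of positive feedback (deposits) with decay (evaporation): without evaporation the system could never abandon a route that was once popular.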
The other significant area for biologically inspired telecommunications is
evolutionary computation. This is a large and active academic and applied field
which, despite the wide variety of specific approaches, shares as its underlying
inspiration evolution by natural selection [2,3]. If the individuals of a population
produce offspring which inherit some of the parental characteristics, but with
some variation, and if the number of offspring produced by an individual is in
proportion to the 'fitness' of the individual, then as the generations go by the
individuals in the population will tend to become 'fitter'.
In the natural world this is evolution as described by Darwin, and has generated
the vast wealth and range of life on Earth. The individuals are the organisms, be
they bacteria, elephants, redwoods or algae. If the individuals are instead solutions
to some artificial problem we can use the evolutionary process to find better
solutions. For example it is possible to represent the telephone wiring for a new
housing development as an 'individual' which can then be given a 'fitness' according
to how much it will cost to build (cheaper is fitter!) and how well it meets the
expected needs of the occupants. With a population of these solution individuals an
evolutionary process can get going. Initially perhaps most of the solutions will be
fairly poor, but some will be less poor than others, and these will have more
'children' which resemble them, but with some variation. Over time the solutions
improve. There are many complicating factors which must be addressed to make this
kind of 'optimisation' effective, in particular to avoid getting 'stuck' with a
population of individuals which have improved on their parents but are now trapped in
a 'local optimum' - a small hillock in the fitness landscape from which they
cannot escape to climb the 'global optimum' mountain.
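The evolutionary loop described above - inheritance with variation, and offspring in proportion to fitness - can be sketched in a few lines. Everything here is illustrative rather than taken from the planning tools cited in [4]: the bit-string 'wiring plan', the made-up cost function (number of 1-bits) and the parameter values are all inventions for the sketch.

```python
import random

def evolve(bits=20, pop_size=30, generations=80, mut_rate=0.05, seed=0):
    rng = random.Random(seed)
    # a candidate 'wiring plan' is a bit string; its cost is the number of
    # 1-bits, so cheaper (fitter) plans have fewer 1s
    pop = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [bits - sum(plan) + 1 for plan in pop]  # cheaper is fitter
        elite = min(pop, key=sum)                           # keep the best plan
        # offspring in proportion to fitness (roulette-wheel selection)
        parents = rng.choices(pop, weights=fitnesses, k=pop_size - 1)
        # children inherit the parent's bits, with occasional mutation
        pop = [elite] + [[bit ^ (rng.random() < mut_rate) for bit in parent]
                         for parent in parents]
    return min(sum(plan) for plan in pop)                   # best cost found

best_cost = evolve()
assert best_cost <= 5   # far cheaper than a typical random plan (cost ~10)
```

Keeping the best individual unchanged each generation ('elitism') is one of many standard devices for stopping the population losing ground; escaping local optima, by contrast, usually needs extra mechanisms such as larger mutations or crossover.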
Artificial evolution has been applied to many optimisation problems, and is
currently used as an effective and time-saving part of the planning process for new
local telephone networks [4].
What is it that leads us to believe cellular biology can also be genuinely useful
as a source of design ideas, rather than a mere post-hoc rationale for our
engineered system? The answer can be summed up in two words: 'distribution' and
'autonomy'. These are the key features which will facilitate huge advances in
computing and telecommunications, and these are the key attributes of cellular
biology. If we take a look at the kind of problems which are becoming increasingly
pressing in the telecommunications field we can see that in many ways they match
the general features of cellular systems.

2 Telecommunication Problems

The extraordinarily rapid rise of the power and capability of computers and
telecommunications in the second half of the 20th century is founded on silicon
semiconductor technology organised in the 'Von Neumann' architecture. Data and a
program are held in memory. According to the sequence of instructions in the
program, data are brought from the memory to the central processing unit, some
logical function is performed and the results placed in memory. This way of
processing information is extremely well suited to general purpose, human engineered
computation. The series of logical steps can be programmed by one person and
understood by another. Also the continuing advances in silicon chip design have
allowed the central processor, which carries the burden of getting through the
(often very many) logical steps which constitute a program, to keep pace with the
requirement for computing power.
This principle of centralised information processing is adhered to, for reasons
of engineering tractability, in systems large (such as a mobile phone network) and
small (such as an individual personal computer).
However, there are weaknesses inherent in the conventional approach:
• A centralised design principle. Information must be transported to the central
processor, processed and removed.
• Serial processing. The data are processed one after another and the program is
interpreted sequentially.
• Brittle architecture. The hardware and software are not tolerant of errors and
imperfections. Small mistakes or impurities will often lead to system failure.
• Static inflexible solutions. A system or network is often tailored to an assumed
environment (e.g. assumptions about traffic demand in various parts of a
network); if that situation changes, performance can degrade substantially.
The spectacular progress of conventional systems in the late 20th century
means that at the start of the 21st century we have software and hardware of a
scale and complexity which begins to expose these weaknesses. Increasingly we
will find that these are the limiting factors on continued advances in the
capabilities of our systems and networks [5].

3 Features of Cells

Of course there is no such thing as a 'generic' cell. Each real example of a
biological cell is specialised for its functional niche. However, it is possible to look at
cells at a sufficiently abstract level to identify some common features shared by all
cells from E. coli to epithelial cells.
• Evolutionary history
• Division history
• 'Life history'
• Dynamic, metabolic
• Autonomous
• Emergent control

3.1 Evolutionary History

One thing which all natural cells share is an evolutionary lineage. Every one has
an unbroken sequence of ancestors stretching back 3.5 billion years. Needless to
say, evolution by natural selection is a central process for life on Earth and the
concept is essential for understanding contemporary cell biology. The power of
this evolutionary process as a means for achieving functioning systems without
explicit goal-driven design has been a source of inspiration for artificial systems
for several decades.

3.2 Division History

In addition to the implications of its long evolutionary history for its current form
and range of function, there is the effect of the more recent ancestry. That is: over
time scales which allow no significant evolutionary change there is an influence of
ancestors in terms of choosing which subset of the cell's potential functions is
actually expressed. For example, in a multicellular organism the cell lineage during
development has an important impact on the form and function of any individual
cell.

3.3 'Life History'

And the finest-grained 'historical' influence on the cell is the impact of events
which have impinged on the cell during its lifetime. Two cells with identical
evolutionary and division lineages may behave in very different ways depending on
the signals they have received. Again a clear example can be taken from a
multicellular animal: the fruitfly Drosophila. All the cells in an individual animal
clearly share an evolutionary lineage. If we look at the peripheral nervous system
during development, we can find a single cell which is about to divide. The two
daughter cells clearly share a division lineage, yet one of them will go on to
produce the bristle which projects through the cuticle to the 'outside world' while the
other will make the neuron which transmits the sensory information from that
bristle to the central nervous system. There are very many documented examples of
this kind of decision-making process in unicellular and multicellular organisms.
Indeed it is possible to view the development of any organism as a combination of
division lineage and life history of cells.

3.4 Dynamic, Metabolic

How does a cell 'know' what it 'should' do next? What is the nature of its internal
state which is influenced by division lineage and life history, and which deter-
mines current cell behaviour and the range of possible reactions to stimuli? At an
instant in time we can in principle take a 'snap shot' of the gene expression state
of a cell. That is to say the degree to which each gene in the genome of the cell is
'turned on' or 'expressed'. We could also make an instantaneous measure of the
level of all the chemicals of the cellular machinery - the structures, signals and
food of the cell. Together these things might define a cell's state. In practice, even
with modern genomic, proteomic and metabolomic techniques it is not possible to
perform a complete inventory of a cell. In practice people studying cell behaviour
will measure the levels of expression of a carefully selected subset of the cell's
genes, or will measure the concentrations of a subset of the cellular chemicals.
All of these things are dynamic - the chemical concentrations and the levels of
gene expression have not reached a static stable point. In some cases the 'snapshot'
of the cell conceals the fact that the chemical or gene expression being
measured is changing rapidly. In other cases there is a stable level which is maintained
for a significant time (seconds to years) but this is a dynamic equilibrium -
chemicals are being broken down and built up. This is the essence of metabolism:
a balance between the catabolic (breaking things down) and the anabolic (building
things up). This 'churning' of cellular components often strikes the human eye as
wasteful but, because it gives the cell excellent responsiveness to change and
allows (or indeed enforces) the pooling of cellular resources, it makes an extremely
effective strategy in a world where the raw materials and energy are out there for
whoever can get them.
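The balance described here can be sketched numerically. In this minimal sketch (rate constants, the Euler time step and the function names are all invented illustrative choices, not measurements), the standing level settles at k_syn / k_deg while the total material 'churned' keeps growing far beyond that level:

```python
def simulate_turnover(k_syn=10.0, k_deg=0.5, dt=0.01, steps=5000):
    """Integrate dC/dt = k_syn - k_deg * C with simple Euler steps,
    tracking total synthesis and degradation to show the 'churning'."""
    c = 0.0
    made = destroyed = 0.0
    for _ in range(steps):
        synth = k_syn * dt
        degr = k_deg * c * dt
        c += synth - degr
        made += synth
        destroyed += degr
    return c, made, destroyed

level, made, destroyed = simulate_turnover()
# The level approaches k_syn / k_deg = 20, a dynamic equilibrium:
# the total flux (made, destroyed) dwarfs the standing level.
```

The snapshot (level) hides the turnover; only the flux totals reveal it, which is the point the text makes about metabolism.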

3.5 Autonomous

Cells are autonomous - their membranes encapsulate their decision-making and
action-taking machinery. A cell sits, bathed in external influences, and it will react
to those influences, but it does not suspend decision-making while waiting for
instructions from 'higher up'; indeed it has no hierarchy of control. Hierarchies in
the cellular world are produced structurally - an axon in the optic nerve is in a
sense awaiting a signal from the cells of the retina, and this hierarchy exists
because of the positions of these cells in the body. But the axon is still autonomously
running its own internal metabolism, and will 'happily' convey signals from
sources other than the retina. Anything which depolarises its membrane will cause
it to fire.

3.6 Emergent Control

The control of cell behaviour is 'emergent' - that is, it arises from the interactions
of the many small parts of the cell to give an outcome which can subsequently be
observed and described at a higher level. This is true over 'design time' (i.e. evolutionary
time) and 'operational time' (i.e. the life of an individual cell).
For example we may observe the behaviour of E. coli as it alters its metabolism
to use lactose instead of glucose as its energy source. We can understand the mo-
lecular interactions in terms of their higher level behaviour as a switch, and we
can see the logic of that switch in terms of its action and its benefit to the bacte-
rium. Analysis is made possible by treating the control pathway in isolation from
'irrelevant' context, such as the activity of other control pathways in the same cell.
None of these perspectives is available to the bacterium - the pathway has
evolved, and now functions, without any abstraction to higher level function, and
without the ability to ignore its cellular context.
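The observer's 'switch' view of the lactose/glucose decision can be caricatured as logic (a deliberate high-level abstraction of the lac operon, not a molecular model; the function name is ours). This is exactly the kind of description available to us, but not to the bacterium:

```python
# Coarse logical abstraction of the E. coli lactose/glucose switch:
# the lactose-metabolising genes are expressed when lactose is
# present AND glucose is absent.

def lac_genes_expressed(lactose_present: bool, glucose_present: bool) -> bool:
    inducer_active = lactose_present   # lactose inactivates the repressor
    cap_active = not glucose_present   # low glucose activates CAP
    return inducer_active and cap_active

assert lac_genes_expressed(True, False)      # switch to lactose metabolism
assert not lac_genes_expressed(True, True)   # glucose still preferred
assert not lac_genes_expressed(False, False)
```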

4 Cell-based Solutions for Telecommunications Problems

So we have a man-made world of predominantly centralised, static, serial
solutions, which have served us well but are now showing their limitations. In the
natural world, at the cellular level we find predominantly dynamic, autonomous,
emergent solutions which depend on history and local context, rather than intelligent
design, for their 'fit' to the problem at hand. Is it possible to use lessons from
the cells to allow our artificial systems to continue to improve beyond current
limitations? Three examples suggest that we can. The first uses ideas from cellular
'life history' and autonomy to suggest ways of making mobile phone networks
flexible and adaptable to user demands regardless of network size. The second
uses evolutionary history, lineage history and life history to show that cells can be
used as the 'designers' of multicellular systems. The third focuses on emergent
control, along with evolution, to show that cells can learn to solve problems even
when the problem itself has not been explicitly described.

4.1 Fruitflies and Mobile Phones

Developing cells have many different types of signal and receptor molecules
which allow them to communicate with neighbouring cells. A given type of signal
molecule will bind specifically to one, or a few, types of receptor. As a conse-
quence of signal binding to receptor, a message is usually passed into the cell on
whose surface the receptor is carried.
One example of such a pair of signal and receptor is Delta (the signal) and
Notch (the receptor). Versions of these molecules are found in many different
animals, from nematodes (C. elegans) to humans (H. sapiens) [6]. They are used
for communication between cells at many stages in development. For example, in
the fruitfly Drosophila, they are needed for correct nervous system formation in
the early embryo and for developing the correct wing shape during pupation.

Fig. 1. The mutual inhibition process. Concentrating on just two adjacent cells, we can see
the 'flip-flop' switch resulting from the molecular interactions. The red arrow represents the
Delta signal. The Notch receptor is shown in green. An inhibitory signal (blue) reduces the
amount of Delta produced in a cell whose Notch receptors have been activated by its neigh-
bours. In this example, panel A shows the initial state in which both cells produce roughly
equal amounts of inhibitory Delta. B The upper cell begins to gain the upper hand. C The
upper cell has decided to make a bristle, and forced its neighbour not to do so

Although there are other molecules which affect the communication between
Delta and Notch, the core of their interaction is simple (Fig. 1). The binding of
Delta to Notch causes a protein to enter the nucleus of the Notch-carrying cell and
alter the state of expression of the DNA. The effect of the alteration is that the
cell's production of Delta is reduced. Thus its ability to send a signal like the one
it has just received is diminished. Production of Delta is linked to a choice of cell
type. In other words the cells having this conversation are 'contemplating' becom-
ing one sort of cell (for example neural), which correlates with Delta production,
rather than another (for example epidermal). The exact nature of the choice de-
pends on the particular developmental process at hand; the Delta and Notch mole-
cules are not concerned with that, just with ensuring that a choice is made. So, if a
cell perceives a Delta signal from another cell, it makes less Delta itself and hence
becomes both less able to signal Delta to its neighbours, and less likely to choose
the Delta-correlated cell type.
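The 'flip-flop' just described can be sketched as a two-cell simulation. All parameters here (inhibition gain, update rate, noise level, function names) are invented for illustration; the point is only that mutual Delta inhibition amplifies a small random asymmetry until one cell 'wins' and its neighbour is forced down:

```python
import random

def flip_flop(steps=200, gain=2.0, seed=1):
    """Two cells, each suppressing the other's Delta production in
    proportion to the Delta signal received via Notch."""
    rng = random.Random(seed)
    # near-equal starting Delta levels, broken only by a little noise
    delta = [0.5 + 0.01 * rng.random(), 0.5 + 0.01 * rng.random()]
    for _ in range(steps):
        new = []
        for i in (0, 1):
            received = delta[1 - i]                        # neighbour's signal
            target = 1.0 / (1.0 + (gain * received) ** 4)  # inhibited target
            new.append(delta[i] + 0.1 * (target - delta[i]))
        delta = new
    return delta

levels = flip_flop()
winner, loser = max(levels), min(levels)
```

Which cell wins depends only on the initial noise, mirroring the text: Delta and Notch ensure that a choice is made, not which choice.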

4.1.1 Frequency Allocation


This developmental mechanism was the inspiration for a new method for solving a
well-known problem in mobile telephony networks. Known as the frequency allo-
cation problem, it arises from the concept of frequency re-use. The limited number
of radio frequencies available to the operator of a mobile telephone network must
be allocated to the much larger number of base stations in the network such that
those base stations have enough frequencies to meet demand for calls without re-
using a frequency already being used by another nearby base station.
The method relies on the analogy between cells and base stations. Each base
station is made to negotiate for use of each of the frequencies, just as the cells
negotiate for cell fate choice. For every frequency, a cell has a simulated 'Notch
receptor' and is synthesising simulated 'Delta' in an effort to inhibit its neighbours.
All base stations begin with almost equal preference for all frequencies. The small
inequalities are due to the presence of noise. Over the course of time, as the nego-
tiations continue, they abandon most frequencies (due to inhibition by their
neighbours) while increasing their 'preference' for a few frequencies.
This approach was originally tested in a small simulated network based on the
Cellnet network in East Anglia circa 1995 ([7] and see Fig. 2, C, although 29 fre-
quencies, rather than the four shown were available to be allocated, and each base
station required more than one frequency). The benefits of the method are exactly
those which would be anticipated due to its developmental inspiration. It provides
dynamic, robust solutions which continue to meet the demand for calls even as
traffic fluctuates in the network. Because the method is inherently distributed,
placing the processing load at the base stations rather than at a central controller, it
is able to run continuously online and its performance does not suffer as the net-
work expands.
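A toy version of this negotiation (our own sketch, not the published algorithm) can be run on a small invented network: each station holds a preference for every frequency, suppresses its neighbours' preferences multiplicatively, and renormalises. As in the text, preferences start almost equal and noise breaks the symmetry, after which each station concentrates on a few frequencies while inhibiting its neighbours:

```python
import math
import random

def negotiate(adjacency, n_freq=2, rounds=500, beta=1.0, seed=3):
    """Mutual-inhibition frequency negotiation on a graph of stations."""
    rng = random.Random(seed)
    n = len(adjacency)
    pref = []
    for _ in range(n):
        row = [1.0 + 0.01 * rng.random() for _ in range(n_freq)]  # noisy start
        s = sum(row)
        pref.append([p / s for p in row])
    for _ in range(rounds):
        new = []
        for i in range(n):
            row = []
            for f in range(n_freq):
                # 'Delta' inhibition received from neighbours on frequency f
                inhibition = sum(pref[j][f] for j in adjacency[i])
                row.append(pref[i][f] * math.exp(-beta * inhibition))
            s = sum(row)
            new.append([p / s for p in row])
        pref = new
    return pref

adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # a line of four stations
pref = negotiate(adjacency)
chosen = [max(range(2), key=lambda f: pref[i][f]) for i in range(4)]
```

Note the distributed character: each station only ever reads its own neighbours' preferences, which is what lets the real method run continuously online.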

4.1.2 Applicability
The method inspired by development differs markedly from current practice in
frequency allocation. Currently the network operators use centralised planning: at
regular intervals a planning committee meets to decide on a new frequency plan
(according to network performance measures since the last such meeting). This
plan is then disseminated to the base stations, and will remain in place until the
next meeting. The current method works best when networks are relatively small
and there is little fluctuation in traffic. The larger the network, and the greater the
fluctuations, the harder it becomes for a centralised solution strategy to continue to
deliver acceptable quality of service.

Fig. 2. A The mutual inhibition process which limits the number of cells adopting the neural
'bristle-making' fate. Many have the potential (grey) but only a few realise that potential
(black) while inhibiting their neighbours (white). B The pattern of bristles that results in the
adult fly. C A map of East Anglia in the UK, with the coverage areas of base stations in a
mobile phone network coloured in as if there were only four frequencies to allocate (blue,
red, green, yellow)

In effect the efficiency with which the resource (frequencies) is being used declines
because that resource cannot be moved to meet demand. As a result the operator
must invest in constructing new base stations (i.e. buying more resources) to
maintain quality of service. It is in exactly these circumstances of dynamic traffic
in a large network that the method inspired by development is likely to deliver
tangible benefits. This work is now being developed under contract to QinetiQ for
the UK Ministry of Defence. It is hoped that it can be usefully applied to manag-
ing frequencies for military radio communications. Battlefield communications
are an excellent example of the kind of network in which this approach delivers
the greatest advantage over centralised control methods. They are complex and
highly dynamic, with even the base stations moving around from minute to minute.
In addition the lines of communication must continue to operate effectively
even when many base stations are no longer functioning.

4.2 Design by Morphogenesis

Morphogenesis means 'creation of shape'. It is the process of cell division and
movement which produces the adult organism from the original single egg cell.
Producing the correct patterns of cell morphology and function, which will
function as an adult organism, requires cells to change their behaviour in response
to their local environment. Changes in gene expression underlie changes in
behaviour, but Design by Morphogenesis [8] is not intended to simulate behaviour at
this level (CellSim, below, attempts this, and see also [9] for prior work). The
'cells' in Design by Morphogenesis are controlled by an artificial neural network,
but have the ability to respond to their environment, and modify it in their turn, by
emitting chemicals which may be perceived by their neighbours.
Design by Morphogenesis (DBM) is unusual in that it seeks to combine a faithful
simulation of aspects of natural morphogenesis with a useful design methodology.
As artificial life techniques are increasingly applied to real industrial problems,
the issue of how to 'engineer' emergent behaviour will become ever more
pressing. By providing a process for translating an evolvable 'genome' into a
'phenotype' of measurable fitness we believe DBM has the potential to allow the
'engineering' to escape from the low-level setting of neural network weights,
hopefully to occupy a higher-level 'nurturing' role.
The DBM work was motivated by two complementary desires. Firstly to focus
on the process of morphogenesis in the spirit of simulation. In other words to try
to demonstrate the kind of form creation achieved by a fertilised egg. Secondly to
exploit morphogenesis as part of the design process for artificial systems. At root
these aims are very much part of evolutionary computation: we believe that evolu-
tionary design is effective, we assert that the morphogenetic 'mapping' from geno-
type to phenotype is important, therefore we aim to mimic nature's morphogenesis
and we seek ways to apply that process to human design problems.

4.2.1 Simulation
A DBM individual consists of a single cell, with a genome, which is placed at the
centre of a 3D space and allowed to divide over time to give a cluster of cells.
Each cell has a small range of behaviours available to it. The genome determines
both the nature of the chemicals which the cell can produce, and the way in which
the cell responds to the chemicals in its immediate environment. During the lifetime
of one individual, the chemicals produced by the original cell and all its
daughters can diffuse and react within the 3D space. Hence the individual is
constantly altering and responding to its chemical environment. There are several
such individuals in a population. Individuals are evaluated according to a fitness
function based on the positioning of the cells of the cluster. Fitter individuals are
used as parents to produce genomes for the next generation.
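The generational loop just described can be sketched skeletally. In this sketch 'development' is a drastic stand-in (the genome directly encodes cell positions rather than growing them through simulated chemicals), and the fitness function simply rewards cells lying near a target radius from the origin, loosely echoing the hollow-ball goal discussed below; all names and parameters are invented:

```python
import random

def develop(genome):
    """Toy 'development': each gene directly places one cell in 3D space."""
    return [(g[0], g[1], g[2]) for g in genome]

def fitness(genome, target_radius=1.0):
    """Reward cells sitting near a shell of the target radius (higher is fitter)."""
    err = 0.0
    for x, y, z in develop(genome):
        r = (x * x + y * y + z * z) ** 0.5
        err += (r - target_radius) ** 2
    return -err

def evolve(pop_size=20, n_cells=8, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[[rng.uniform(-2, 2) for _ in range(3)] for _ in range(n_cells)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # evaluate and rank
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]            # fitter half become parents
        children = [[[c + rng.gauss(0, 0.05) for c in gene] for gene in p]
                    for p in parents]             # mutated offspring
        pop = parents + children
    return history

history = evolve()
```

Because parents survive unchanged, the best fitness in the history is non-decreasing; the interesting DBM question, which this sketch sidesteps, is what happens when the genotype-to-phenotype mapping runs through a genuine developmental process.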

4.2.2 Results
To date DBM has been applied to generating the shapes and differing cell identities
of early embryogenesis. In other words the design requirements are also derived
from developmental biology, rather than an analogous engineering problem. The
authors were interested in generating shapes resembling the blastocyst and seg-
mentation along an axis.
The blastocyst is a hollow ball of cells produced at a very early stage of vertebrate
development. The infoldings of the blastocyst subsequently produce layers
of cells which differentiate to give the tissue types of the developing animal. It is
challenging to produce a hollow ball from a simulation of cell division because the
tendency of such a simulation is to produce a solid ball, in which daughter cells
cluster around the original central cell. It is also something which the simulation
must be able to achieve, since we would only have confidence in applying DBM
to engineering design goals when such an approach can demonstrate its ability to
produce elementary biological morphologies.
Using an appropriate fitness function it was possible to evolve DBM individuals
which were more 'blastocyst-like' than their ancestors. Fig. 3 shows an example
of one such individual.

Fig. 3. An example of a DBM individual evolved to produce a blastocyst-like morphology.
The initial single cell has divided and the resulting cells have moved outwards from the
origin. Although it does not look like a neat spherical ball of cells, this individual does score
well in terms of reproducing some of the basic features of the blastocyst, and is markedly
better than random non-evolved individuals

Segmentation is a common theme in the development of multicellular organisms.
The 'stripes' of gene expression along the anterior-posterior axis of the developing
fruitfly are a famous example. The repeated segments can then undergo
further subdivisions, and differentiate one from another. To evolve a segmented
morphology in DBM a fitness function was used which rewards dense clusters of
cells separated by empty space. In a real embryo this would correspond to clusters
(or 'segments') of cells with cell identity 'A' interspersed with cells with identity
'B'. Individuals do evolve which meet the fitness criteria. Two examples are
shown in Fig. 4. Again the morphology does not resemble a neatly formed embryo
but it is a significant improvement over random, unevolved individuals and is
clearly responding to the fitness pressure.

4.2.3 Applicability
DBM is certainly at an early stage but it has the potential to be very useful as an
automated design tool, most obviously in the telecommunications domain. The
networked nature of many telecommunications problems, ranging from the design
of physical networks to the services provided by those networks, could be well
served by the division-based process of DBM. It would be easy, for example, to
imagine the DBM 'cells' as nodes in a network which assume their final position
according to the influences of their neighbours, or even dynamically alter their
behaviour based on the 'chemical' environment around them. Looking further to the
future, it is possible to envision DBM-type ideas being used to design the autono-
mous software agents which can move across networks providing services to their
'owners'. Adaptivity and specialisation without explicit instructions from the
owner might make such agents far more useful. In summary, DBM has shown that
biological-style morphogenesis can be simulated in a way which preserves the
useful design power of that process and we expect the telecommunications prob-
lems of the future to be of the type which will benefit from this approach.

Fig. 4. 11- and 25-cell individuals evolved for segmentation. In these examples the individuals
have divided into two clusters. The images are oriented so that the axis along which
they have 'segmented' runs horizontally across the page

4.3 CellSim

CellSim is a simulation of single-celled organisms in a simple spatial world.


The focus of the simulation is on the internal workings of the cells. In particular
each 'cell' has a 'genome' which specifies the 'proteins' which could be synthesised
by that cell. The rate at which different proteins are actually being synthesised
by the cell at any instant depends on the 'concentrations' of endogenous and
exogenous proteins in the cell 'cytoplasm'. Of course, synthesis of new proteins
will alter these concentrations, as will the binding of external ligands to receptors
on the surface of the cell. Thus the expression state of the genome is constantly
being affected by the phenotype of the cell, and the phenotype is constantly
changed by expression of new proteins. The intention is to use this system to
investigate, in the first instance, the effects of complex, dynamic internal control
networks on stability of function and evolvability. Ultimately it is hoped to use the
system as the basis of a multicellular morphogenesis simulator.

Fig. 5. The genome structure of the simulated cells. Each cell has a 400-bit genome. This
genome is divided into ten genes, each with 40 bits. Each gene is divided into five domains,
each with 8 bits. These domains are called 'more', 'less', 'func', 'modi' and 'act'
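The genome layout of Fig. 5 translates directly into code; slicing is the only operation needed (the random bit string standing in for a genome is purely illustrative):

```python
import random

DOMAINS = ('more', 'less', 'func', 'modi', 'act')

def parse_genome(bits):
    """Slice a 400-bit genome into ten 40-bit genes of five 8-bit domains."""
    assert len(bits) == 400
    genes = []
    for g in range(10):
        gene_bits = bits[g * 40:(g + 1) * 40]
        genes.append({name: gene_bits[d * 8:(d + 1) * 8]
                      for d, name in enumerate(DOMAINS)})
    return genes

rng = random.Random(42)
genome = [rng.randint(0, 1) for _ in range(400)]
genes = parse_genome(genome)
```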

4.3.1 Real Bacteria


Bacteria are single-celled organisms, usually smaller than eukaryotic cells and
with a simpler internal structure. The single circular chromosome carries all the
essential genes for cell behaviour and metabolism, although there may be additional
small chromosomes - 'plasmids' - carrying genes for antibiotic resistance
or mating type. A lipid membrane and a rigid cell wall surround the cells, providing
structural strength and a barrier to the free passage of all but the smallest
molecules. This allows the cells to maintain an intracellular environment which is
very different from the external environment. Molecules can be selectively im-
ported or exported and the osmolarity of the cell can be different from the envi-
ronment.
The 'decisions' made by an individual bacterium depend ultimately on the dy-
namic state of the genome. In some cases, such as a 'decision' to switch from us-
ing glucose as an energy source to using lactose, the expression state of the genome
must change. New genes must be expressed and others repressed. In other
cases the 'decision' can be made without a change to gene expression because it
relies on interactions between previously synthesized proteins. The chemotactic
behaviour of some motile bacteria is an example of this: the bacteria sense and re-
spond to differential chemical concentrations in their environment allowing them
to home in on food and avoid noxious chemicals.

4.3.2 Simulation
CellSim simulates the gene expression-based control of cellular behaviour
(Fig. 5). It does not simulate internal cell structure or any limitations on diffusion,
so any 'molecule' can instantaneously interact with any other. CellSim also does
not simulate molecular structure: all interactions between simulated molecules are
based on 'soft' complementary matching between bit strings (a '1' in string A will
bind to a '0' at the corresponding position in string B, but 'soft' matching means
that there will be non-zero binding affinity between two strings even if the
complementary match is not perfect at all positions).
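The soft matching rule can be written down directly. Normalising the count of complementary positions by the string length is our own choice; the chapter does not give the exact affinity formula:

```python
def affinity(a, b):
    """Soft complementary matching: every position where one string has a 1
    and the other a 0 contributes to the binding affinity, so imperfect
    complements still bind with non-zero strength."""
    assert len(a) == len(b)
    complementary = sum(1 for x, y in zip(a, b) if x != y)  # 1 binds 0
    return complementary / len(a)

perfect = affinity([1, 0, 1, 0], [0, 1, 0, 1])   # full complement
partial = affinity([1, 1, 1, 1], [0, 0, 1, 1])   # half complement
none = affinity([1, 0, 1, 0], [1, 0, 1, 0])      # identical strings: no binding
```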
The 'more' and 'less' domains determine the 'expression level' of the gene, i.e.
the rate of production from that gene. The 'func' domain dictates the nature of the
product, specifying whether the expressed string will be released from the cell into
the environment, retained within the cell or tethered in the membrane of the cell,
with either its 'modi' or 'act' domain exposed to the external environment. 'func'
also specifies the qualitative effect of binding by another string to the 'modi' do-
main of this string: it can result in an increase or a decrease in the effective activ-
ity of the 'act' domain. 'modi' and 'act' specify the sequence of the product.
The strings already present in the intracellular environment control the expression
levels of the genes by binding to the 'more' and 'less' domains (Fig. 6). The
'occupancy' of these domains is the sum of the binding affinities of all existing in-
tracellular strings to the domain. The greater the 'occupancy' of the 'more' do-
main, the greater the expression level. Conversely, the greater the occupancy of
the 'less' domain, the lower the expression level.
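The occupancy computation follows the text directly: sum the affinities of all intracellular strings to a domain. The final mapping from the two occupancies to an expression level is not specified in the chapter, so the ratio used below is an invented monotone choice (rising with 'more' occupancy, falling with 'less' occupancy); the example strings are likewise illustrative:

```python
def affinity(a, b):
    """Soft complementary matching between equal-length bit strings."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def occupancy(domain, strings):
    """Sum of binding affinities of every intracellular string to one domain."""
    return sum(affinity(domain, s) for s in strings)

def expression_level(gene, strings):
    more = occupancy(gene['more'], strings)
    less = occupancy(gene['less'], strings)
    return more / (more + less + 1e-9)   # invented monotone rate law

gene = {'more': [1, 1, 1, 1, 1, 1, 1, 1],
        'less': [0, 0, 0, 0, 0, 0, 0, 0]}
activator = [0, 0, 0, 0, 0, 0, 0, 0]   # complements 'more': raises expression
repressor = [1, 1, 1, 1, 1, 1, 1, 1]   # complements 'less': lowers expression

high = expression_level(gene, [activator])
low = expression_level(gene, [repressor])
```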

Fig. 6. Gene expression is regulated by the strings which are present in the cell. These may
have been expressed from the cell's own genome, or have entered the cell from the
environment

In addition to binding to the gene domains, the intracellular strings can bind to
each other. The 'act' sequence can bind to the 'modi' sequence of other strings,
and hence alter the effective activity of the string, increasing or decreasing it as
specified by 'func'.
Thus a feedback control network is established within the cell (Fig. 7), with the
pool of intracellular strings created by, and acting on, the genome. This internal
'genetic regulatory network' is linked to the external environment in two directions.
Firstly strings in the environment can affect the cell by crossing the membrane
to bind intracellular 'modi' domains or by binding to externally facing
'modi' domains. Secondly the cell may alter the environment by exporting strings
and by exhibiting behaviours such as movement.

Fig. 7. A cell in a CellSim world. The long dark strand represents the genome, from which
new strings are synthesized, under the regulation of strings already present in the cell.
Strings synthesized from the genome have two 8-bit domains, 'modi' (shown dark) and
'act' (shown light). The 'func' domain of a gene determines both the effect on 'act' if another
string binds 'modi', and the position of the string in the cell (free in the cell, anchored
in the membrane with 'modi' or 'act' exposed to the exterior, or secreted from the cell [not
shown])

4.3.3 Results and Applications


'CellSim' has been tested in a simple two-dimensional simulated environment
with a 'food source' creating a standing concentration gradient of food. The
'cells' are provided with a genome as described above, but in addition they are
given a 'motor', intended to be a simple simulation of the bacterial flagellar motor,
which has two states: 'on' and 'off'. When the motor is on, the cell moves in a
straight line at constant speed. When the motor is off, the cell performs a random
walk. This is intended to be analogous to the famous swimming and tumbling
behaviours of chemotactic bacteria. Since both the food and the switches on the motor
which determine its state are represented as bit strings, it is possible to produce
a control circuit using the 'genes' and strings to connect the presence of food to
switching of the motor. Simulated cells have been engineered with a control circuit
which gives chemotactic behaviour. In other words the cells swim up the food
concentration gradient.
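The swim/tumble behaviour can be sketched with a crude stand-in controller (run while the sampled concentration is rising, re-orient otherwise) in place of the string-based circuit of Fig. 8; the food field and all parameters are invented for illustration:

```python
import math
import random

def food(x, y):
    """Invented food field: a single smooth peak at (10, 10)."""
    return -((x - 10.0) ** 2 + (y - 10.0) ** 2)

def chemotax(steps=2000, speed=0.1, seed=0):
    """Run-and-tumble: keep heading while food rises, tumble when it falls."""
    rng = random.Random(seed)
    x = y = 0.0
    angle = rng.uniform(0.0, 2.0 * math.pi)
    last = food(x, y)
    for _ in range(steps):
        here = food(x, y)
        if here < last:                        # concentration falling: tumble
            angle = rng.uniform(0.0, 2.0 * math.pi)
        last = here
        x += speed * math.cos(angle)           # motor 'on': straight run
        y += speed * math.sin(angle)
    return x, y

x, y = chemotax()
```

Even with this trivially simple rule the cell climbs the gradient, which is why a small genetic circuit like the one in Fig. 8 suffices for chemotaxis.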
By setting such a simulation in an evolutionary context, where cells have off-
spring which inherit their genes (with variation), it is possible to attempt to evolve
control circuits rather than engineer them. This was done by taking a population of
cells with an engineered chemotactic circuit (Fig. 8) and then randomising one of
the genes involved in that control. The cells were then released into the food
environment. Food was accumulated passively by the cells, at a rate proportional to the
concentration at their current position, and when a threshold was reached the cell
divided. Thus cells with some residual chemotactic ability were implicitly
rewarded because they would accumulate food (on average) faster than their
competitors. Over several generations chemotactic behaviour was 're-discovered' by
the cells, and as one would expect this behaviour became fixed in the population.

Fig. 8. A CellSim control structure which allows the simulated cells to move up a 'food'
concentration gradient and hence accumulate food faster. All three genes in this case have
'func' domains which specify that higher occupancy of their 'modi' domain will increase
effective activity of their 'act' domain. The 'food' sequence is a feature of the environment,
not coded in the genome, and diffuses freely into the cell (the effect would be the same if
the food could not enter the cell but the 'modi' domain of gene 1 was exposed to the external
environment and hence able to bind to the food). Binding of food to gene 1 increases
the effective activity of gene 1's 'act' domain. This binds to gene 2's 'modi' domain and
increases gene 2's 'act' activity. Gene 2's 'act' domain directly stimulates the 'motor more'
sequence, which promotes straight swimming by the cell. It also stimulates gene 3, whose
'act' domain stimulates 'motor less' and promotes tumbling by the cell. Because the
stimulation of 'motor more' by gene 2 leads to counteracting stimulation of 'motor less' by
gene 3, the control circuit allows 'motor more' stimulation to outweigh 'motor less'
stimulation while the concentration of food is rising. When food concentration in the cell's
immediate environment starts to fall the trailing stimulation of 'motor less' by gene 3
predominates and the cell will tend to tumble. Overall this leads to gradient-climbing
behaviour by the cell

4.3.4 Applicability
CellSim is complementary to Design by Morphogenesis (above), concentrating on
the issue of gene expression, while largely ignoring the physical reality of chemi-
cal diffusion. It shares one possible area of application with DBM: software agents
moving around a network could in principle re-configure their internal control
structures in ways which implicitly match the environment. If finding particular
pieces of data allows the 'CellSim Agents' to proliferate, they should learn to do
more of that activity. In the nearer term, CellSim is more likely to have a role in
adding to the field of evolutionary computation, particularly where it overlaps
with 'artificial life'.

5 Conclusion

There is great potential for cellular biology to inspire and influence design, engineering
and control of complex systems. Telecommunications is a prime area for
the early realisation of this potential because it combines complexity and computation
to give systems which are physically distributed but must respond dynamically
and appropriately to user behaviour. The ability of telecommunications systems
to seamlessly accommodate growth and 'heal' when damaged is also very
desirable and is well addressed by cell-based approaches.
However, it is not yet possible to point to an example of this thinking which has
been implemented in a real system and demonstrated its effectiveness as a day-to-day
piece of a functioning telecommunications system. The examples used in this
chapter are all at the research stage. The fruitfly example is the closest to realisation
and could conceivably be put into practice within 3 years. The DBM and
CellSim examples are both further from direct application, and currently address
more abstract and general aspects of telecommunications problems. It is most
likely that the cellular thinking in these latter two examples will be incorporated
into mainstream telecommunications by offering a new dimension to established
evolutionary computation approaches. In a sense both offer new ways of addressing
the old problem of escaping local hills to climb the optimality mountain.

Acknowledgements. Many thanks to all members of the Future Technologies
Group at BTexact. In particular, thanks to Cefn Hoile for the 'Design by Morphogenesis'
example, and to Morag Gray for help with the CellSim 'chemotaxis'
simulation. All of this work was funded by the BTexact long-term research
budget. Thanks to participants in the CytoCom network sponsored by the EPSRC,
in particular to Ray Paton for making it happen and keeping it happening.

References

1 Bonabeau, E., Dorigo, M. and Theraulaz, G., 2000. Inspiration for optimization
from social insect behaviour, Nature 406, pp 39-42
2 Holland, J. H., 1992. Adaptation in natural and artificial systems, MIT Press,
Cambridge, Mass.
3 Bäck, T., Hammel, U. and Schwefel, H.-P., 1997. Evolutionary computation:
comments on the history and current state, IEEE Transactions on Evolutionary
Computation, Vol. 1, No. 1
4 Assumu, D. and Mellis, J., 1999. GenOSys - automated tool for planning copper
greenfield networks, British Telecommunications Engineering, Vol. 17,
pt. 4, pp 281-290
5 Tateson, R., Shackleton, M., Marrow, P., Bonsma, E., Proctor, G., Winter, C.
and Nwana, H., 2000. Nature-inspired computation: towards novel and radical
computing, BT Technology Journal (Millennium Edition) 18, No. 1, pp
73-75 plus CD content
6 Artavanis-Tsakonas, S., Matsuno, K. and Fortini, M. E., 1995. Notch signalling,
Science, Vol. 268, pp 225-232
7 Tateson, R., 1998. Self-organising pattern formation: fruit flies and cell
phones, in A. E. Eiben, T. Bäck, M. Schoenauer and H.-P. Schwefel (eds.),
Proceedings of 5th Int. Conf. PPSN, Springer, Berlin, Heidelberg, New York,
pp 732-741
8 Hoile, C. and Tateson, R., 2000. Design by morphogenesis, BT Technology
Journal 18, No. 4, pp 85-94
9 Eggenberger, P., 1997. Evolving morphologies of simulated 3D organisms
based on differential gene expression, in Proceedings of Fourth European
Conference on Artificial Life, pp 205-213, P. Husbands and I. Harvey (editors),
MIT Press, Cambridge, Mass.
10 Paton, R. (ed.), 1994. Computing with biological metaphors, Chapman and
Hall, London
11 Vogel, S., 1999. Cats' paws and catapults, Penguin, London
Symbiogenesis as a Machine Learning Mechanism

L. Bull, A. Tomlinson

Faculty of Computing, Engineering and Mathematical Sciences,
University of the West of England, Bristol, BS16 1QY, UK
larry.bull@uwe.ac.uk

Abstract. Symbiosis is the phenomenon in which organisms of different species
live together in close association, potentially resulting in a raised level of fitness
for one or more of the organisms. Symbiogenesis is the name given to the process
by which symbiotic partners combine and unify - forming endosymbioses and
then potentially transferring genetic material - giving rise to new morphologies
and physiologies evolutionarily more advanced than their constituents. This
process is known to occur at many levels, from intra-cellular to inter-organism. In this
chapter we begin by using the abstract NKCS model of coevolution to examine
endosymbiosis and its effect on the evolutionary performance of the entities
involved. We are then able to suggest the conditions under which endosymbioses
are more likely to occur and why; we find they emerge between organisms within
a window of their respective 'chaotic gas regimes' and hence that the association
represents a more stable state for the partners. This general result is then exploited
within a machine learning architecture to improve its performance in non-Markov
problem domains. That is, we show how symbiogenetic mechanisms found at the
cellular level can be successfully applied to computational learning.

1 Introduction

Symbioses are commonplace in the natural world and it is therefore argued that
the phenomenon is of great evolutionary significance (e.g. [1]). Symbiogenesis is
the hypothesis that, if the relationship between symbionts evolves in the direction
of increasing dependency, potentially 'a new formation at the level of the organism
arises - a complex form having the attributes of an integrated morphophysiological
entity' [2, p.5]. In a more restricted sense symbiogenesis refers to the
formation of hereditary endosymbioses, the symbiotic association in which partners
exist within a host partner and the relationship is maintained in offspring. Perhaps
the most remarkable example of symbiogenesis may have occurred during the
evolution of eukaryotes, whereby free-living bacteria appear to have become the
intra-cellular organelles (see [2] for an account of the history of this concept). Maynard-
Smith and Szathmary [3] have also argued that symbiogenesis gave rise to
chromosomes during the evolution of cellular structures.

In this paper we begin by examining endosymbiosis, comparing its evolutionary
progress to the equivalent association where the partners do not become so
closely integrated. We use a version of Kauffman and Johnsen's [4] genetics-
based NKCS model, which allows the systematic alteration of various aspects of a
coevolving environment, to show that the effective unification of organisms via
this 'megamutation' [5] will take place under certain conditions. Kauffman and
Johnsen used this model to examine the dynamics of heterogeneous coevolution.
Initially in our model potential endosymbionts exist as cooperating symbionts
evolving within their respective separate populations, i.e. as heterogeneous
species. During simulations we apply a megamutation operator to the members of
these separate populations, causing them to become hereditary endosymbionts,
forming their own sub-population, from within which they then evolve. The
members of this sub-population compete against each other and members of the
original cooperating symbiotic species for existence. Our results indicate that the
successful formation of an endosymbiotic sub-population occurs when the partners
are, to use Packard's [6] terminology, away from their ideal liquid regimes -
their edges of chaos - and are within their respective chaotic gas regimes. Endosymbiosis
can therefore be seen as a mechanism by which units of selection may
stabilise the environment in which they evolve.

The results from the abstract NKCS model are then used to improve the performance
of a machine learning architecture which uses a coevolutionary mechanism
- Learning Classifier Systems (LCS). LCS primarily use genetic algorithms
(GA) [8] and the bucket brigade algorithm to produce an interacting ecology of
production system rules for a given task. Results indicate that a simple megamutation
operator causing rule-linkage does not lead to improved performance. Encapsulation
is then added to the linked rule-sets such that other rules outside of the
cell/linked complex cannot share in their functionality; cheats that can potentially
exploit the existence of the cooperative structures are excluded. This is shown to
lead to improved performance, particularly in more difficult tasks containing
ambiguous (non-Markov) inputs.

2 Simulated Symbiogenesis

2.1 The NKCS Model

Kauffman and Johnsen [4] introduced the NKCS model to allow the genetics-
based study of various aspects of coevolution. In their model an individual is
represented by a genome of N (binary) genes, each of which depends epistatically
upon K other genes in its genome. Thus increasing K, with respect to N, increases
the epistatic linkage, increasing the ruggedness of the fitness landscapes by
increasing the number of fitness peaks, which increases the steepness of the sides of
fitness peaks and decreases their typical heights. 'This decrease reflects conflicting
constraints which arise when epistatic linkages increase' [4, p.330]. Each gene
is also said to depend upon C traits in each of the other species S with which it
interacts. The adaptive moves by one species may deform the fitness landscape(s) of
its partner(s). Altering C, with respect to N, changes how dramatically adaptive
moves by each species deform the landscape(s) of its partner(s). Therefore we can
adjust the strength of the symbionts' association in a systematic way.

The model assumes that inter- and intra-genome interactions are so complex that
it is only appropriate to assign random values to their effects on fitness. Therefore,
for each of the possible K+(C×S) interactions, a table of 2^(K+(C×S)+1) fitnesses is
created for each gene, with all entries in the range 0.0 to 1.0, such that there is one
fitness for each combination of traits. The fitness contribution of each gene is
found from its table. These fitnesses are then summed and normalised by N to
give the selective fitness of the total genome (the reader is referred to [10] for full
details of both the NK and NKCS models).
Kauffman and Johnsen's basic model uses populations of one individual (said
to represent a homogeneous species) and allele mutation to evolve them in turn.
That is, if a given mutant is found to be fitter than its parent in the current context
of the other species, that species as a whole moves to the genetic configuration
represented by the mutant. This is repeated for all species over a number of
generations. In this paper we apply a generational genetic algorithm to their model,
slightly altering some aspects; the species evaluate and evolve at the same time
and do so within populations of many individuals. This allows the appropriate
association between the symbionts to emerge. However, this does not appear to
cause the loss of any of the dynamics Kauffman and Johnsen report. They show
how both inter- (C) and intra-genome (K) epistasis affects a coevolving system,
particularly in the attainment of Nash equilibria ('a combination of actions by a set
of agents such that, for each agent, granted that the other agents do not alter their
own actions, its action is optimal' [10, p. 245]). We will return to their results
later.

2.2 Genetic Algorithm Simulation

In this paper there are three species (A, B, and E), two of which (A and B) are said
to be in a beneficial symbiotic relationship and have the potential to form an
hereditary endosymbiosis. The third species (E) is said to represent the other species
with which the symbionts coevolve - their ecologically coupled partners. For A
and B there are three sub-populations (Fig. 1). There are the two sub-populations
for the symbionts living cooperatively and evolving separately, each receiving
their own separate measure of fitness, with these being initially set to size P (the
sizes of these two populations are always the same as each other). The other sub-
population is of hereditary endosymbionts, consisting of individuals carrying the
genomes of both symbionts, where they are treated as 'an interspecies supraorganism'
[11] for selection by receiving a combined fitness. The size of this sub-
population is initially set to zero, but during the course of a simulation individuals
can move from the other sub-populations to this, and back, via a megamutation
operator (hence there are always effectively P sets of the two genomes overall).
Over evolutionary time the most appropriate configuration of the two symbionts
of the model - be it an endosymbiosis or to stay as two separate populations of
cooperating species - will emerge rather than being prescribed a priori; the
hereditary endosymbiotic version of a given two-species symbiosis can emerge and
compete against the separate but cooperating version of the association for
population 'space'.

Fig. 1. During evolutionary time the population space of the symbiotic species A and B can
be invaded by their equivalent endosymbiotic association (A&B)

Initially the two symbiotic species exist in separate populations, both of size P,
where they are paired simply by taking the corresponding members from their
respective populations (that is, speciesA[x] is paired with speciesB[x], where
0 ≤ x < P). A corresponding member of the environmental third species
(speciesE[x]) is also chosen. Individuals are then evaluated on their given NKCS
function and awarded their own separate measure of fitness. A generational GA is
then run on the total population space of the two symbionts. At the end of this
process, and each succeeding evaluation-selection-generation of individuals, a
check is made to see if the symbionts form/dissolve a more intimate association.
The formation of a general symbiosis in nature is termed recognition and 'may
include specific chemical interactions, tolerance or suppression of host defences, and
metabolic, morphological or behavioural interactions' [12, p.5]. For our purposes
the formation of an endosymbiosis is established by simply testing a probability
(megamutation) p_s. Satisfaction of this probability means both partners form an
hereditary endosymbiosis, where they receive a combined measure of fitness for
selection and where an individual consists of two genomes - the speciesA and
speciesB members. From then on they do not evolve within separate populations,
but are treated as one by the GA, having been joined via this megamutation
operator. The same operator is also applied to the population of hereditary endosymbionts,
working in reverse, to stop the drift of members away from the population
of separate symbionts; separating is just as likely as joining. Evaluation of the two
endosymbiotic sets of genes is the same as when they were separate; the speciesA
genome is evaluated using the NKCS fitness tables for speciesA and the speciesB
genome with the speciesB tables.
The size of these competing symbiotic sub-populations (endo and separate) is
controlled by a version of the roulette wheel proportional selection used in many
genetic algorithms. Here the total fitness ratings of all individuals in all symbiotic
sub-populations are summed at the end of each generation. Then a random number
is chosen between zero and this total fitness. Starting with any given sub-
population, the scores of all individuals are again summed until this value is equal
to or greater than the random number chosen. The following production of an
offspring uses gene mutation on the selected single parent. This whole process is
repeated until P new pairs of individuals have been created, from whichever sub-
populations. It may have been noted that the cooperating separate species receive
individual fitnesses, meaning they are on average half as likely to be chosen
compared with the endosymbionts (supraorganisms) through the method of selection.
Therefore we say that the separate individuals' sub-population appears as one,
with the combined fitnesses of speciesA and of speciesB during the summing
processes. Whenever this sub-population is chosen a pair of separate symbionts
are always created, proportionally, from their own populations. The environmental
species (E) also evolves asexually within its own population via a generational
GA, using proportional selection and allele mutation. This population allows us to
examine the effect endosymbioses have on their other ecological partners. The
process is repeated for a given number of evolutionary generations.
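The selection scheme just described amounts to a standard roulette wheel spun over combined fitnesses, with each separate A/B pair entering the wheel at the sum of its two fitnesses so that it competes on equal terms with the endosymbiont supraorganisms. A minimal sketch, with names of our own choosing:

```python
import random

def roulette_select(fitnesses, rng=random):
    """Fitness-proportional selection: spin once and return the index of
    the individual whose cumulative fitness first reaches the spin point."""
    total = sum(fitnesses)
    spin = rng.uniform(0.0, total)
    running = 0.0
    for i, f in enumerate(fitnesses):
        running += f
        if running >= spin:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off

def pair_fitnesses(separate_pairs, endosymbionts):
    """Separate A/B pairs enter the wheel with their combined fitness;
    endosymbionts enter with the single combined fitness they received."""
    return ([fa + fb for fa, fb in separate_pairs] +
            [fe for fe in endosymbionts])
```

Repeating the spin P times, with single-parent mutation applied to each winner, reproduces the generation loop described above.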
We now use the model described above to examine the evolutionary performance
of endosymbiosis, with all results presented being averaged performance
over 20 runs.

2.3 Results

We implement our version of Kauffman and Johnsen's model using three species,
each evolving separately in populations of 100 (P=100) individuals. Three genome
lengths are used throughout, N=8, 12 and 24 - all experiments are repeated
for the various N. We introduce a second value of interdependence for the symbionts
to the third species (Ce); we can alter both the interdependence between the
symbionts (C) and the interdependence both symbionts have with their environment
(Ce). This rate of environmental interdependence is the same for the symbionts
whether or not they are joined. The per-bit allele mutation rate of the GA (p_m)
is set at 0.001 and the megamutation operator (p_s) is set at 1/(5P) - that is, on
average two sub-population members will alter their association once in five
generations (it is suggested that the probability is comparatively low in nature with
respect to the gene allele mutations). All species have the same K value.
A number of experiments were carried out in which the values of K, C and Ce
were varied. We find that for various combinations of these values different sub-
populations become dominant within the model.
For Ce=1 (i.e. low) and any K we find that increasing C increases the population
percentage of the hereditary endosymbionts, such that when C ≥ K there are
more endosymbionts than separate cooperating symbionts. The greater C is in
relation to K, the bigger the difference in population percentage (Figs. 2 and 3).
Conversely, when K>C the cooperating separate symbiotic species are dominant
(i.e. >50%), though they never dominate as much as the endosymbionts do. This
shows that, as the degree of dependency increases between symbionts, forming an
hereditary endosymbiosis becomes increasingly beneficial to them. Also note from
Figs. 2 and 3 that the difference between K and C must be large for either association
to become significantly dominant (>80%). Kauffman and Johnsen [4] state

that K=C corresponds to the coevolving partners' optimum edge of chaos regime
- their liquid state. It can be seen from our results that in these models the edge of
chaos has quite a large basin of attraction towards coexistence of both associa-
tions; conditions must stray considerably before the population will diverge sig-
nificantly either way.

Fig. 2. Graphs showing that as organisms' attributes K and C are varied the ratio of separate
symbionts to endosymbionts in a population varies. When K > C the separate symbionts
remain dominant. Increasing C with respect to K increases the population percentage held
by the endosymbionts

Increasing Ce does not appear to alter the general result of greater C giving rise
to more endosymbionts, although from Figs. 3 and 4 it can be seen that an increase
in Ce can allow a higher percentage of endosymbionts to exist for lower C than
before. Conversely, for higher C a lower percentage of endosymbionts are found
when Ce is high than when it is low. That is, if the symbionts become increasingly
dependent upon other aspects within their environment, forming an endosymbiosis
is beneficial only up to some point. As this dependence increases (i.e.
when Ce is high), the benefits of joining appear to decrease; if the symbionts' or
endosymbionts' dependence upon other species within their environment
increases, the benefits of either association are decreased and the system returns to
its edge of chaos-like constitution. Burns' [13] suggestion that some endosymbionts
no longer experience environmental factors may be a way in which nature has
dealt with this problem, i.e. the effects of Ce are lost on endosymbionts.

2.4 Discussion

From these simulations it can be seen that as the interdependence (C) between
symbionts increases, hereditary endosymbiosis becomes increasingly dominant
within the overall population; the further the partners are into their chaotic gas re-
gimes, the more likely they are to form an endosymbiosis.


Fig. 3. Showing the effects of varying K, C and Ce on the population percentage held by
the endosymbiotic association of a given symbiosis


Fig. 4. Showing that as the amount of interdependence with the environment (Ce) is in-
creased, the difference in the two associations' performance decreases

Kauffman and Johnsen [4] investigated the effects interdependence has upon
separate coevolving partners, finding that:
4A: When the partners are least interdependent (C is low) the time taken for the
system as a whole to reach a Nash equilibrium is low, fitness during the period of
pre-Nash oscillations is close to the equilibrium fitness and the oscillating fitness
is high.
4B: When the degree of interdependence is increased, eventually to being obligate
(C=N), they find the reverse is true, with an increase in the time taken for
equilibrium to be reached, with fitness during the oscillations being below the
equilibrium fitness and where this oscillating fitness is low.
The results are attributed to the fact that, as the amount of interdependence is
increased, the effect of each partner on the others' fitness landscape increases; the
higher C, the more landscape movement. This represents an extension to
Kauffman's single organism NK model, in which he found that:
4C: Increasing epistasis (K) with respect to N increases the ruggedness of a
fitness landscape, but where the height of the optima decreases.
We can use Kauffman's findings to suggest underlying reasons for our result
above, that as interdependence increases between symbionts the relative performance
of an endosymbiotic association increases. Hereditary endosymbionts represent
two or more genomes being carried together effectively as one genome and as
a consequence the partners' inter-genome epistasis becomes intra-genome epistasis
for a larger genome. The effects of intra-genome epistasis are very different
from inter-genome epistasis, as can be seen from 4A-4C above. That is, the
combined landscape of an hereditary endosymbiosis does not move due to the
symbionts' interdependence C - but is more rugged, and larger, than that of the
individual partners living separately (4C). There will still be movement from any
environmental interdependences of course, but now their internal epistasis has
increased and they have (potentially, depending on Ce) moved into their combined
form's solid state (K>C). This can be seen as a genetics-level analogy of Burns'

[13] suggestion that closer associations form between symbionts as a response to
environmental unpredictability.
Thus for separate partners when C is low the amount of possible landscape
movement is likewise low. From 4A it can be seen that separately coevolving
symbionts oscillate with a high fitness under these circumstances. Therefore any
hereditary endosymbionts must find at least as high a fitness in their larger, more
stable, more rugged fitness landscape in order to survive in succeeding genera-
tions. They are also hindered by the fact that there is little time for them to search
their larger space - even if the fitness level they find is comparable to that of sepa-
rate individuals (which is more likely to be a local optimum for them anyway) -
before the separate symbionts reach a higher Nash equilibrium. Thus the evolu-
tionary benefits of an hereditary endosymbiosis in such cases are low because the
separate symbionts can quickly find a high fitness level and then quickly reach
equilibrium.
When C is high the hereditary endosymbionts' combined landscape will be
more rugged, containing many low optima (4C). In this case the oscillatory fit-
ness of the coevolving separate symbionts is low (4B) - they are well into their
gas regimes (C>K). Any hereditary endosymbionts will easily find optima in their
more stable but rugged landscape and if that peak is comparable to, or better than,
the low oscillatory fitness of the separate partners, they will survive into the suc-
ceeding generations. The time taken for the separate symbionts to reach equilib-
rium is longer here (4B), which also allows the larger endosymbionts greater
searching time.
Kauffman [10] states that as K approaches its maximum (N-1), an organism
experiences a 'complexity catastrophe' whereby achievable fitness actually
decreases as the height of optima in its fitness landscape falls to the mean height
(0.5). Intuitively therefore the formation of an hereditary endosymbiosis will not
always represent an optimum strategy for the partners over a longer time scale.
When C is high, the resulting K of the endosymbiotic supraorganism will also be
high; intergenome interdependence becomes intragenome epistasis as described
above. We ran the same experiments on a version of Kauffman and Johnsen's
hill-climber NKCS model and found that this was indeed the case (e.g. Fig. 5).
Here the two separate symbiotic species were represented by two individuals, as in
Kauffman and Johnsen's work, and the initial endosymbiotic version of the
association was represented by an individual carrying both their initial genomes; in
these simulations the existence of an endosymbiotic association was pre-
determined and there was no transfer between the two associations during the
simulations. The environmental species was also included (as a single individual).
All species were evolved using Kauffman and Johnsen's 'fittest mutant' strategy,
except that this was done concurrently as in our population-based GA models. We
found that for the endosymbiotic version of the symbiosis to represent the most fit
association over longer time scales a 'window' of symbiotic organism attributes
exists. Roughly, when C is larger than K and the sum of these two values is less
than the genome length (when C>K and C+K<N), the endosymbiosis represents
an evolutionarily fitter association. This window of attributes represents an area
covering half the organisms' respective chaotic gas regimes. Forming an endo-
symbiosis moves the combined partners into a more solid regime as the effects of

their interdependence are lost and they become a supraorganism with high epistasis. If
this new epistasis is large with respect to the new genome length (K>N/2) it is
more efficient in the long run for the symbionts to stay separated. That is, for
highly dependent symbionts (i.e. those in their chaotic gas regimes), it is better to
stay separated if joining via endosymbiosis will make them a supraorganism well
into its solid regime. We would expect to find the same kind of dynamics occurring
within our population-based GA models if they were left to run for much
longer (or if we enforce niching or alter the selection pressure). Our results
certainly match those using Kauffman and Johnsen's model over a shorter time scale
or when we fix the existence of both associations.


Fig. 5. Showing that a limit exists on how far into their respective gas regimes symbionts
can be for endosymbioses to represent the optimal strategy (results using Kauffman and
Johnsen's basic model)

3 Symbiogenesis in Machine Learning

Ikegami and Kaneko [14] were perhaps the first to use inherited genetic linkage in
an artificial system. They introduced a 'genetic fusion' operator to link genomes
within a population using a GA to produce strategies for a simple game, showing
that more successful larger genomes could emerge over time. Goldberg et al. [15]
have applied the concept at the gene level whereby contiguous genes link to avoid
disruption under crossover, akin to the aforementioned process by which genomes
may have emerged [3]. We now describe the use of symbiogenesis within a simple
machine learning architecture; the process is included within an inductive
framework to improve its performance.

3.1 ZCS: A Simple Learning Classifier System

Learning Classifier Systems (LCSs) are rule-based systems consisting of a population
(ecology) of interacting rules, each in the form of a condition-action pair. System
utility is assigned by the external environment and distributed to individual
rules through a reinforcement learning algorithm. New rules are generated via a
genetic algorithm. ZCS [16] is a 'Zeroth-level' LCS without internal memory,
where the rule-base consists of a number (N) of rules in which the condition is a
string of characters from the ternary alphabet {0,1,#} and the action is represented
by a binary string (# represents a don't care). Associated with each rule is a
strength scalar which acts as an indication of the perceived utility of that rule
within the system. The strength of each rule is initialised to a predetermined value
termed S_0.
Reinforcement in ZCS consists of redistributing strength between subsequent
'action sets', or the matched rules from the previous time step which asserted the
chosen output or 'action'. A fixed fraction (β) of the strength of each member of
the action set ([A]) at each time-step is placed in a 'common bucket'. A record is
kept of the previous action set [A]-1, and if this is not empty then the members of
this action set each receive an equal share of the contents of the current bucket,
once this has been reduced by a predetermined discount factor (γ). If a reward is
received from the environment then a fixed fraction (β) of this value is distributed
evenly amongst the members of [A]. Finally, a tax (τ) is imposed on all matched
rules that do not belong to [A] on each time step in order to encourage exploitation
of the stronger classifiers.
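The strength redistribution just described can be summarised in a short sketch. The dictionary-based rule records and function name are our own illustrative choices, not Wilson's original code:

```python
def zcs_reinforce(action_set, prev_action_set, matched_not_in_a,
                  reward, beta=0.2, gamma=0.71, tau=0.1):
    """One time step of ZCS strength redistribution. Rules are dicts with a
    'strength' key; the three arguments are [A], [A]-1 and [M]-[A]."""
    # Each member of [A] pays a fixed fraction beta into a common bucket.
    bucket = 0.0
    for rule in action_set:
        payment = beta * rule['strength']
        rule['strength'] -= payment
        bucket += payment
    # The previous action set shares the discounted bucket.
    if prev_action_set:
        share = gamma * bucket / len(prev_action_set)
        for rule in prev_action_set:
            rule['strength'] += share
    # A fraction beta of any external reward is shared evenly amongst [A].
    if reward and action_set:
        share = beta * reward / len(action_set)
        for rule in action_set:
            rule['strength'] += share
    # Matched rules outside [A] are taxed to encourage exploitation.
    for rule in matched_not_in_a:
        rule['strength'] -= tau * rule['strength']
```

Calling this once per time step, then copying [A] into [A]-1, implements the 'implicit bucket brigade' over an episode.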
ZCS employs two discovery mechanisms, a global ('panmictic') GA and a covering
operator. On each time step there is a probability ρ of GA invocation. When
called, the GA uses fitness proportional (roulette wheel) selection to determine
two parent rules based on strength. Two offspring are produced via mutation
(probability μ) and crossover (single point with probability χ). The parents then
donate half of their strengths to their offspring, who replace existing members of
the rule-base. The deleted rules are chosen using roulette wheel selection based on
the reciprocal of rule strength. If on some time step no rules match, or all matched
rules have a combined strength of less than φ times the rule-base average, then a
covering operator is invoked which generates a new matching rule with a random
action.
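Matching against the ternary alphabet and the covering operator can be sketched as follows. The wildcard probability used during covering is an assumption of ours (the text above does not state one), and the rule representation is illustrative:

```python
import random

def matches(condition, message):
    """A ternary condition matches a binary message if every non-# symbol
    equals the corresponding message bit."""
    return all(c == '#' or c == m for c, m in zip(condition, message))

def cover(message, n_actions, wildcard_prob=0.33, s0=20.0, rng=random):
    """Covering: build a rule whose condition matches the current message
    (each bit generalised to '#' with probability wildcard_prob - an assumed
    value) and whose action is chosen at random. Initial strength is S_0."""
    condition = ''.join('#' if rng.random() < wildcard_prob else bit
                        for bit in message)
    return {'condition': condition,
            'action': rng.randrange(n_actions),
            'strength': s0}
```

By construction the covered condition matches the message that triggered it, so the system is never left with an empty match set.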
The default parameters presented for ZCS, and unless otherwise stated for this
paper, are: N = 400, S_0 = 20, β = 0.2, γ = 0.71, τ = 0.1, χ = 0.5, μ = 0.002, ρ = 0.25,
φ = 0.5.
Thus ZCS represents a 'basic classifier system for reinforcement learning that
retains much of Holland's original framework while simplifying it so as to
increase ease of understanding and performance' [16]. For this reason the ZCS
architecture has been chosen to examine the basic behaviour of classifier systems
with the process of symbiogenesis added. The reader is referred to [16] for full
details of ZCS.

3.2 Symbiogenesis in a Learning Classifier System

Wilson and Goldberg [17] were the first to suggest that rule-linkage may help
LCSs form complex rule structures, what they termed rule corporations. According
to Goldberg and Wilson, the rule-base of a 'corporate classifier system' (CCS)
would contain not only single rules, but also clusters of rules. These corporations
would only be reproduced or deleted as a unit, hence synchronisation is assumed,
and formed by a mutation-type operator. For reproduction, the fitness of a corporation
would be dependent upon the fitness of its members, possibly the average
strength, such that it would be advantageous for rules to link together rather than
remain single. If average fitness was used to determine the fitness of a corporation
then this may be sufficient to encourage corporate linkage. Given the results in
Sec. 2, the increased stability in the evaluation environment can be expected to
promote linkage between highly interdependent rules.
Holland [18] has presented the Echo system, an artificial ecosystem simulation
in which agents move from site to site, interacting with each other and local re-
sources. The agents in the system are given the ability to increase in size and com-
plexity as individual agents join 'complex aggregates' or merge together to form
'macro-agents'. The proposed approach to this is to give each simple agent a long
chain of 'tag' strings in addition to its basic chromosome. Some of these strings
will remain dormant in the single agent and will only be activated if the agent
joins to form a higher level structure (triggered by a pattern matching procedure).
This suggests one possible approach to the implementation of corporations within
a classifier system based on the idea of dormant linkage templates.
In ZCS, a ruIe consists of a condition, an action, and also a strength value. In
this work an implementation of a corporate classifier system has been facilitated
by adding a few more parameters.
If corporations are viewed as chains of rules, then a rule can at most be directly
linked to only two other rules. If this approach is taken then each rule will require
two link parameters ('link forward' and 'link back') that when active reference
other rules within a corporation. These links will be initialised as inactive but
when two rules are selected for joining, then one of each rule's links ('link for-
ward' for one rule, 'link back' for the other) will be set to reference the other rule.
This concept of rule-linkage is analogous to the gene-linkage employed by the
afore-mentioned work in GAs. Here, linkage between genes is used to encourage
the formation and propagation of good building blocks in the population of a GA.
In a corporate classifier system rule-linkage is used to encourage associations be-
tween rules through the formation of inter-dependent rule chains.
In addition to this, each rule also contains a 'corporate size' parameter and a
'corporate id.' parameter, included to facilitate subsequent processing. Initially size
is set to 1 and the corporate id. is left inactive. Within corporations, all rules hold
the same values for size and corporate id., and these are set during the formation of
the corporation, either through 'corporate joining' or through the action of
crossover by the GA. The classifier system keeps a record of how many corporations
have been formed, and this is used to determine the id. reference for each new
corporation.
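The per-rule bookkeeping described above can be sketched as a small data structure. This is our illustrative Python sketch; the class and field names are ours and are not taken from the original ZCS/ZCCS implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    """A ZCS-style rule extended with the corporate parameters described above."""
    condition: str                      # ternary string over {0, 1, #}
    action: int
    strength: float
    link_back: Optional[int] = None     # id of the previous rule in the chain
    link_forward: Optional[int] = None  # id of the next rule in the chain
    corp_size: int = 1                  # 1 while the rule is non-corporate
    corp_id: Optional[int] = None       # shared by all members of a corporation

def corporate_join(a: Rule, b: Rule, a_id: int, b_id: int, new_corp_id: int) -> None:
    """Join two single rules: a's 'link forward' and b's 'link back' are
    activated, and both rules take the new corporation's id. and size."""
    a.link_forward = b_id
    b.link_back = a_id
    for r in (a, b):
        r.corp_id = new_corp_id
        r.corp_size = 2
```

A longer corporation is then just a chain of such rules sharing one corporate id., with the two chain ends holding the only inactive links.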
Symbiogenesis as a Machine Learning Mechanism 39

Initially, coupling/linkage occurs panmictically with random probability on each
time step, in the same manner as the GA. An initial coupling probability of 0.25
(once every four time steps on average) was decided on, but exhaustive testing is
required to determine an optimum rate. This optimum rate is likely to be dependent
on such factors as rule-base size, GA activity and the nature of the task to be
learned.
Within the rule-base, rules are selected for linkage using a fitness proportional
roulette wheel policy, with slot size based on fitness. A number of alternative
policies exist for selecting partners to join, for example the random scheme
used in the abstract models above, but this approach appears to work well.
If the forward link of the first rule selected, or the back link of the second is al-
ready activated then that rule is already corporate and the corporation is scanned
for the appropriate end rule (i.e. the rule in that corporation with an inactive 'for-
ward link' or 'back link' respectively), and this becomes the selected rule. Fur-
thermore if the first rule is corporate, say belonging to corporation X, then the
second rule is selected from the set: [P] - [X], where P represents the population.
If this precaution is not taken then there is the risk of forming 'circular' corpora-
tions.
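The selection-and-scanning procedure just described might be sketched as follows. The dictionary representation and the function names are ours; rules are assumed to be held in a dict keyed by rule id:

```python
import random

def roulette(candidates, fitness):
    """Fitness-proportionate (roulette wheel) choice over a list of rule ids."""
    return random.choices(candidates, weights=[fitness[r] for r in candidates])[0]

def chain_end(rules, rid, link):
    """Scan a corporation for the appropriate end rule: the member whose
    `link` ('link_forward' or 'link_back') is inactive."""
    while rules[rid][link] is not None:
        rid = rules[rid][link]
    return rid

def select_linkage_pair(rules, fitness):
    """Pick two rules to link.  If a selected rule is already corporate, its
    corporation is scanned for the free end; the second rule is drawn from
    [P] - [X] so that no 'circular' corporation can form."""
    ids = list(rules)
    first = chain_end(rules, roulette(ids, fitness), 'link_forward')
    corp = rules[first]['corp_id']
    pool = [r for r in ids
            if r != first and (corp is None or rules[r]['corp_id'] != corp)]
    second = chain_end(rules, roulette(pool, fitness), 'link_back')
    return first, second
```

The returned pair always has an inactive 'link forward' on the first rule and an inactive 'link back' on the second, so the join itself is a single pointer assignment at each end.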
Based on the proposals of Wilson and Goldberg [17] corporate activity influ-
ences the discovery mechanisms but does not directly influence the activity of the
production system. For this reason it was decided to give each rule one further pa-
rameter, fitness. The individual fitness parameter is used as before by the produc-
tion system, but GA activity is now guided by rule fitnesses. Corporate fitness is
equal to the average fitness of member rules. The rules' individual fitnesses,
however, are left unaltered.
Having defined the nature of corporations and proposed a method for their for-
mation it is now necessary to determine what modifications must be made to the
discovery component.
Rule replacement, be it by the cover operator or the GA, like the roulette wheel
selection for reproduction, is based on the reciprocal of rule fitness. If a corporate
rule is selected for deletion then the corporation is first disbanded, and the
selected individual is then tagged for deletion. These are the only modifications
required by the covering operator, but the GA alterations require further attention.
The crossover site is selected as usual and a single offspring rule is created
from the two parent rules. This differs from the original ZCS (which produces two
children from crossover) but the rate of genetic input (rule replacement rate) is
consistent with ZCS as the GA rate is set to 0.25 (once every four time steps on
average). The new rule inherits 1/3 of the strength of each parent if crossover is
employed (or 1/2 of the parent's strength if it is not).
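This strength accounting can be illustrated with a small sketch. We assume here, as in Wilson's ZCS, that the strength a child inherits is deducted from its parent(s); the function name is ours:

```python
def offspring_strength(p1: float, p2: float, crossed: bool):
    """Return (child strength, (new p1 strength, new p2 strength)), assuming
    inherited strength is deducted from the parents as in ZCS."""
    if crossed:
        child = p1 / 3 + p2 / 3          # 1/3 of the strength of each parent
        return child, (p1 * 2 / 3, p2 * 2 / 3)
    child = p1 / 2                       # 1/2 of the single parent's strength
    return child, (p1 / 2, p2)

# With crossover: parents at 90 and 60 yield a child of strength 50.
print(offspring_strength(90.0, 60.0, True))   # (50.0, (60.0, 40.0))
```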
The offspring rule inherits 'equivalent' links to the 'link back' of the first par-
ent and the 'link forward' of the second parent. These links, however, will have to
be set not to refer to rules in the original corporations but to the equivalent rules in
the new corporation.
For example, corporation X consists of rules 1,2 and 3; corporation Y consists
of rules 4, 5, 6 and 7 (Fig. 6); and rules 2 and 5 are selected for reproduction. The
new offspring from crossing rules 2 and 5 is termed rule 8; however, rule 2 linked
back to rule 1 so the new corporation (Z) will also require a copy of rule 1 from
40 L. Bull, A. Tomlinson

corporation X, and likewise copies of rules 6 and 7 from corporation Y. The copy
of rule 1 is called rule 1', and those of rules 6 and 7 are called rules 6' and 7' re-
spectively. Corporation Z produced by this corporate crossover operation contains
the following rules: [r1', r8, r6', r7']. In this way the offspring rule, rule 8, is linked
back to the facsimile of rule 1 (rule 1') and linked forward to the facsimile of rule
6 (rule 6').

[Figure: the crossover point falls within parent corporations X (rules 1-3) and Y
(rules 4-7); offspring corporation Z combines copied rules from X, the new rule,
and copied rules from Y]

Fig. 6. Corporate Crossover
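The worked example above can be reproduced with a short sketch that represents corporations as ordered lists of rule ids and marks copied rules with a prime (the representation is ours):

```python
def corporate_crossover(corp_x, corp_y, px, py, child):
    """Build the offspring corporation: copies of everything before parent
    rule px in X, then the new child rule, then copies of everything after
    parent rule py in Y.  Copied rules are marked with a prime."""
    i, j = corp_x.index(px), corp_y.index(py)
    head = [f"{r}'" for r in corp_x[:i]]       # copies taken from corporation X
    tail = [f"{r}'" for r in corp_y[j + 1:]]   # copies taken from corporation Y
    return head + [child] + tail

# The example above: X = [1, 2, 3], Y = [4, 5, 6, 7], crossing rules 2 and 5.
print(corporate_crossover([1, 2, 3], [4, 5, 6, 7], 2, 5, 8))
# ["1'", 8, "6'", "7'"]
```

Note that, as the text states, mutation would then be applied only to the new rule (rule 8), not to the copied members of Z.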

Each additional rule that is reproduced by crossover donates half of its strength
to its offspring as above for reproduction without crossover. Mutation is applied
only to the new rule derived from crossover (i.e. rule 8 in the example).
The basic ZCS model was modified to act as a prototype corporate classifier
system (ZCCS). Modifications were implemented as described in the last section,
and all other system parameters were maintained as in Wilson's original
experiments.
ZCCS was tested in the same environment as Wilson's original ZCS experiment,
Woods 1. A record was kept of system performance for each trial and also of
the mean number of corporations active during each trial.

3.3 Woods 1

Woods 1 is a two-dimensional rectilinear grid of dimensions 5 x 5. 16 cells are
blank, eight contain trees and one contains food (Fig. 7). The classifier system is
viewed as an 'animat' [19] traversing this map in search of food. It is positioned
randomly in one of the blank cells and can move into any one of the surrounding
eight cells on each time step, unless they are occupied by trees. The environment
is toroidal, so if the animat moves off one edge it appears on the opposite edge of
the map. If the animat moves into a 'food cell' then the system receives a reward
from the environment in the form of credit, and the animat is relocated as before.

[Figure: the Woods 1 map. F = food, O = tree, * = animat]

Fig. 7. Woods 1

On each time step the animat receives a message from the environment which
describes the surrounding eight cells. The message is encoded as a 16-bit binary
string, with two bits representing each of the eight cells. A blank cell is represented
by 00, food (F) by 11 and trees (O) by 10 (01 has no meaning). The message is
ordered with the cell directly above the animat represented by the first bit-pair,
proceeding clockwise around the animat.
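This sensory encoding might be sketched as follows; the grid-of-characters representation ('.' for blank) is our illustrative choice:

```python
CODES = {'.': '00', 'F': '11', 'O': '10'}  # blank, food, tree

# Offsets clockwise, starting from the cell directly above the animat.
CLOCKWISE = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def sense(grid, row, col):
    """Build the 16-bit environmental message for an animat at (row, col).
    The grid is toroidal, as in Woods 1."""
    h, w = len(grid), len(grid[0])
    bits = []
    for dr, dc in CLOCKWISE:
        cell = grid[(row + dr) % h][(col + dc) % w]
        bits.append(CODES[cell])
    return ''.join(bits)

# Food above-right and a tree to the right of the animat '*':
print(sense(["..F", ".*O", "..."], 1, 1))  # 0011100000000000
```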
The trial is repeated 10,000 times and a record is kept of a moving average
(over the previous 50 trials) of how many steps it takes the animat to move into
a food cell on each trial. If the animat moved randomly then its performance
would average about 27 steps per trial. Optimum performance in Woods 1 is
1.7 steps.
In this initial test there is no discernible difference between the performance of
ZCCS and ZCS (Fig. 8). The number of corporations in ZCCS rose from 0 to 40
in 100 trials and then climbed slowly to 80 by the end of the run.
Hence this experiment has demonstrated that it is possible to implement a cor-
porate classifier system as proposed by Wilson and Goldberg. The corporate clas-
sifier system used for the experiment can be considered merely a template design
kept as minimal as possible, using the simplistic symbiogenesis process explored
in Sec. 2. However, as noted in Sec. 1, there are many ways in which the system
design could be expanded or modified to achieve more directed gains, drawing
closely on the natural phenomenon. Some of these are now considered.

3.4 Symbiont Encapsulation

Maynard-Smith and Szathmary [1] note that, during the emergence of 'proto-cells',
genetic linkage between 'naked' replicators is not sufficient for more complex
evolutionary structures to emerge. That is, a linked set of cooperating entities at a
given physical location is still open to exploitation from neighbours which are
not linked to them and hence do not necessarily share in their evolutionary future.
Hence, whilst passive localisation can initiate cooperative relationships, only
through the active formation of a protective membrane can a symbiogenetic entity
perpetuate itself effectively. We now consider this within our corporate classifier
system, by first including an analogue to spatial location and then a membrane to
exclude cheats.

[Figure: steps to food (0-10) against trials (0-10,000) for ZCS and ZCCS]

Fig. 8. ZCCS v ZCS in Woods 1

Whilst the position of the rules within the rule-base of an LCS is not significant
to their use, they do have logical relationships with each other and so corporations
are now encouraged to encapsulate chains of inference. Corporate links here take
on a temporal connotation and imply that the rules within a corporation are placed
so as to fire in succession and thus to map a proposed plan of action during the so-
lution of a multiple time step problem.
This is achieved by making linkage a niche operation, or more precisely a
cross-niche operation. Coupling occurs between subsequent match sets. This
means that on time step t there is a possibility that a rule that matches the current
message from the environment may link to a rule which matched the stimulus at
time t-1. This encourages the structuring of meaningful sequences of rules.
To be selected for coupling, a rule must be in the current match set (termed [M])
and its appropriate link must be inactive. Coupling occurs over two time steps. On
the first, a rule in [M] is selected probabilistically (roulette wheel, based on
strength) from those with an inactive 'link forward'; on the second, a rule in the
new match set is selected, again probabilistically, from those with an inactive 'link
back'. Rules already in a corporation are not allowed to join to rules within their
own corporation. In all environments used during testing, the system was reset
after receipt of a reward from the environment on some time step. Corporate links
are not allowed to form between this reward time step and the first one of the
following trial, as this 'move' does not represent any form of causal transition
under the control of the system.
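This two-step coupling procedure might be sketched as follows; the dictionary representation and the function names are our own:

```python
import random

def pick(cands, rules):
    """Roulette-wheel choice based on rule strength."""
    return random.choices(cands, weights=[rules[r]['strength'] for r in cands])[0]

def couple(prev_match, curr_match, rules, after_reward):
    """Try to link a rule that matched the stimulus at time t-1 forward to a
    rule matching at time t.  No link forms across a reward/reset boundary,
    and a corporation may not link to itself."""
    if after_reward:
        return
    firsts = [r for r in prev_match if rules[r]['link_forward'] is None]
    if not firsts:
        return
    first = pick(firsts, rules)
    corp = rules[first]['corp_id']
    seconds = [r for r in curr_match
               if rules[r]['link_back'] is None and r != first
               and (corp is None or rules[r]['corp_id'] != corp)]
    if not seconds:
        return
    second = pick(seconds, rules)
    rules[first]['link_forward'] = second
    rules[second]['link_back'] = first
```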
To further maintain temporal integrity amongst corporate rule-strings the GA is
adjusted to operate within match sets. The idea of a niche GA operating in the
match set was suggested by Booker [20] to introduce mating restrictions and thus
to assist the GA in producing more meaningful offspring as like breeds with like.
In CCS the adjustment is made so that if corporations are selected for crossover
then the resultant corporation should still represent a meaningful series of re-
sponses to experienced stimuli.

Preliminary testing of ZCS with a niche GA indicated that GA activity became
focused on the more frequently visited states, with the result that niche occupancy
for such states with high mean payoff values became excessively large, while states
with low payoff (especially infrequently visited ones) generally had lower, possibly
inadequate, niche occupancy, due to the combination of a niche GA and the
ZCS replacement policy (based on the reciprocal of rule fitness).
A simple, if somewhat ad hoc, solution is to replace rules from within the same
niche that the GA was operating in, on the provision that there are already a mini-
mum number of rules representing the niche. It was decided that for all tests this
minimum number of rules would be 20. For a population of 400 rules, at least 20
niches can be maintained at this level of occupancy even if all rules are 100%
specific. This setting has been found to be adequate for all environments used for
testing.
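The replacement guard just described can be sketched like this (the function name and the representation are ours):

```python
import random

MIN_NICHE = 20  # minimum occupancy before in-niche deletion is allowed

def select_for_deletion(niche_ids, all_ids, fitness, min_niche=MIN_NICHE):
    """Pick a rule to delete.  Delete from the GA's own niche when it holds
    at least `min_niche` rules; otherwise fall back to whole-population
    deletion.  Slot size is the reciprocal of rule fitness, as in ZCS."""
    pool = niche_ids if len(niche_ids) >= min_niche else all_ids
    weights = [1.0 / fitness[r] for r in pool]
    return random.choices(pool, weights=weights)[0]
```

With a population of 400 and a minimum of 20 rules per niche, at least 20 niches can be held at this occupancy, matching the figures quoted above.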
Early testing of the system with these modifications showed that, because of the
dissipation of rewards and payoffs amongst action sets due to the common bucket
of ZCS, although useful corporations did form, they never fully established
themselves within the system and exhibited lower fitness than their peers within the
respective match sets. Consequently their presence made little difference to the
activities of the performance component and their chances of reproduction were
poor (results not shown). In Woods 1 a marginal improvement in performance
compared to ZCCS could be observed, possibly due to the introduction of a niche GA.
Therefore the performance component was adjusted to respond to the presence
of corporations. Action selection in the production system is determined stochasti-
cally, according to the relative strengths of the rules within the current match set.
A roulette wheel policy is employed which selects a rule whose action becomes
the system's action. Now, if this rule is corporate and its link forward is active then
it is tagged as being in control of the system. On the subsequent time step, if the
subsequent rule in the corporation is a member of the new match set then it auto-
matically receives control of the system and forms an action set of size one. In this
way the corporation keeps control of the performance component and is solely re-
sponsible for system decisions until either a reward is received from the environ-
ment or, on some step, the next rule in the corporation chain does not match the
current stimulus. When either of these events occurs the performance component
returns to normal operation. Further, 'internal' corporate rules (i.e. all but the first
rule in the corporation) are flagged as internal and only respond to stimuli during
periods when the corporation to which they belong has control of the system. This
modification is made to further encapsulate corporate rule structures and thus to
reinforce the inter-corporate rule co-dependencies, an analogue for membrane
formation described above.
This mechanism, referred to as 'persistence', allows corporations to directly
prove their true worth without being interrupted and without the final reward be-
ing dissipated amongst parasitic rules that tend to accumulate in action sets close
to rewards. A corporation that indicates a useful series of actions will soon achieve
a fitness value that reflects its capabilities.
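The 'persistence' mechanism might be sketched as a single production-system step; the names and the representation are ours, and tax handling is omitted:

```python
import random

def production_step(match_set, rules, expected):
    """One action-selection step with persistence.  `expected` is the rule the
    controlling corporation expects to fire next (or None).  Returns the
    chosen rule and the next expected corporate rule, if any."""
    if expected is not None and expected in match_set:
        # The corporation keeps control: an action set of size one.
        chosen = expected
    else:
        # Normal stochastic competition on rule strength.
        weights = [rules[r]['strength'] for r in match_set]
        chosen = random.choices(match_set, weights=weights)[0]
    # A corporate rule with an active forward link takes (or keeps) control.
    return chosen, rules[chosen]['link_forward']
```

Control thus passes down the corporate chain until a reward arrives or the next rule in the chain fails to match the current stimulus, at which point `expected` is simply reset to None.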
The final modification to the performance component consists of not charging
tax on time steps when a corporation holds control. In ZCS tax is applied in order
to encourage exploitation of the stronger classifiers, and therefore to increase
pressure against weaker classifiers. If, due to persistence, performance component
control is held by a corporation on some time step then the usual activities of this
component have been suspended, i.e. action selection on that time step is not
based on free competition between competing 'hypotheses' in [M]; the appropriate
corporate rule is automatically selected to make this decision. In this situation,
belonging to [M] and not to [A] is not necessarily an indication of low utility, and so
it is less appropriate to charge tax to these rules on such a time step. The system
was adjusted to include these modifications.

3.5 System Evaluation in Markov and non-Markov Environments

The modified CCS model was initially tested in the previously used environment,
Woods 1. Fig. 9 shows graphs of the average steps taken to reach food over ten
runs. The system performed well, reaching an average of about 2.2 steps to food
over 10,000 runs. The optimum performance in Woods 1 is 1.7, and ZCS achieved
an average of about 3 steps to food. In these tests the ZCS GA was modified to
operate in the match set, and the rule replacement rate was increased to one rule
per time step on average, to facilitate a fair comparison. The modified ZCS
achieved a rate of 2.6 steps to food; in this simple Markov environment CCS can
be seen to provide minimal benefits.
An alternative test environment is now presented which allows for a clearer dif-
ferentiation between competing systems' capabilities. The new test is a simple
variable multi-step environment. On each of N time steps the system is presented
with a stimulus and must select one of A actions, where A is a variable integer
value which defines the breadth of a maze. N is the number of states or nodes to a
reward and thus defines the maze depth. After N steps the system receives a
reward from the environment and a new task then begins. The size of the reward
depends on which route the system chooses, and so over time the system learns the
optimum reward yielding route through the maze. There is, however, more than
one maze. There can be up to Mz different mazes. The system is informed which
particular maze it is being presented with only on the first time step of each trial.
On all subsequent steps the stimulus is representative only of the current time step
in the trial. Hence these tasks fall into the non-Markov category. The maze is
selected randomly at the start of each trial.
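The environment can be sketched as a small class. This is our construction; the bit-string encoding of stimuli is an illustrative choice (maze identity on the first step, time-step index thereafter), consistent with the ambiguity described above:

```python
import random

class DelayedRewardMaze:
    """Sketch of the variable multi-step test environment.  One route per
    maze pays 1,000; all other routes pay 0."""

    def __init__(self, breadth, depth, n_mazes, msg_len=3, seed=0):
        self.A, self.N, self.Mz, self.L = breadth, depth, n_mazes, msg_len
        rng = random.Random(seed)
        # The single rewarded route for each maze (one action per time step).
        self.best = [[rng.randrange(breadth) for _ in range(depth)]
                     for _ in range(n_mazes)]

    def start(self):
        """Begin a trial: the maze is chosen randomly and is identified only
        by this first stimulus."""
        self.maze = random.randrange(self.Mz)
        self.t = 0
        self.on_route = True
        return format(self.maze, f'0{self.L}b')

    def step(self, action):
        """Apply an action; returns (next stimulus, reward-or-None).  After
        the first step the stimulus encodes only the time step, so distinct
        mazes look identical: the non-Markov ambiguity."""
        self.on_route = self.on_route and action == self.best[self.maze][self.t]
        self.t += 1
        if self.t == self.N:
            return None, (1000 if self.on_route else 0)
        return format(self.Mz + self.t - 1, f'0{self.L}b'), None
```

With L = 3 this allocation of the eight stimuli (Mz for the first step, the remainder for later steps) reproduces the depth limits discussed below for two- and four-maze problems.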

[Figure: steps to food (0-10) against trials for ZCS and CCS]

Fig. 9. CCS performance in Woods 1

Fig. 10 illustrates a simple task of this type with A set to 2, N set to 2 and Mz
set to 2. The environmental stimulus at each time step is also included. In this
example, a reward of 1,000 is awarded for one route on each map. All other routes
receive a reward of 0.
In the example in Fig. 10 the message length L is set to 3. With L set to 3 there
are eight possible stimuli, and so for a two-map problem the maximum depth will
be 7, as the first time step (ts0) takes two stimuli. Similarly, with L set to 3 a
four-maze problem may have a maximum depth of 5, etc.
Clearly ZCS will be unable to master more than a single map due to the sensory
ambiguity after the first time step, but CCS should be able to tackle multiple-maze
trials. In the task depicted in Fig. 10 some form of linkage is necessary for the
system to be able to determine the appropriate move on the second time step (ts1).
In maze one the correct action is 0 and in maze two it is 1 (the stimulus, however,
is the same, i.e. 001).
CCS should be able to overcome the ambiguities present in these delayed reward
environments. Also presented are plots of ZCS performance for comparison.
General system parameters are the same as for the tests in the Woods environment
for all systems.
For the purposes of these tests the mazes are kept relatively simple. A is set to 2
for all tests, so on each time step the system merely has to make a 1-bit binary
decision. L is set to 3, and the systems are tested on tasks consisting of two and
four mazes of depths 2 and 3 (Figs. 11, 12 and 13). All parameters are set as in
previous tests, with a coupling rate of 0.25 and a base GA activation rate also of
0.25 for CCS. These graphs show the average scores of ten runs for each system
over 10,000 trials, again with a 50-point moving average filter applied to smooth
the curves. In Figs. 11-13, task 4:3 for example represents a task consisting of four
mazes of depth three.

[Figure: the two mazes of task 2:2, each with a reward on one route; environmental
messages distinguish the mazes on the first time step only, both presenting 001
on the second]

Fig. 10. Simple Delayed Reward Environment - Task 2:2

Immediately it can be seen that ZCS is not really equipped to tackle these
problems. If Mz is set to 1 then ZCS can operate and is able to learn a single maze,
but with four mazes ZCS will be intentionally correct on average about one in four
times at best (i.e. when the map it has learnt is presented). It can be seen that as N
is increased the system becomes increasingly unable to locate the reward. At the
end of a 10,000-trial run on task 2:2 (see Fig. 10) the CCS rule-base was examined,
and two typically observed corporations are presented below.

[Figure: score (0-1,000) against trials for ZCS and CCS]

Fig. 11. Performance in Task 2:2



[Figure: score (0-1,000) against trials for ZCS and CCS]

Fig. 12. Performance in Task 4:2

[Figure: score (0-1,000) against trials (0-10,000) for ZCS and CCS]

Fig. 13. Performance in Task 4:3

Corporation 3846 responds to maze 1 (Fig. 10). At time 0 rule 349 responds to
the presented stimulus (000) and proposes action 0. If rule 349 wins the auction
then at time 1 rule 134 will, as it matches the new stimulus (001), automatically be
in control of the production system. Rule 134 matches the stimulus and then
proposes the correct action for maze 1 at time 1. This is a strong corporation and
many copies of it can be seen in the rule-base.
Corporation 3931 responds to maze 2 (Fig. 10). Rule 202 matches the stimulus
at time 0 and proposes the correct action. On the subsequent time step rule 328
matches the new stimulus and again proposes the correct action. This is also a
strong corporation, copies of which can be seen in the rule-base. In CCS these two
corporations alone are sufficient to solve task 2:2, whereas ZCS is unable to tackle
the ambiguity present on time step 1.

Table 1.

I.D.   Corp' I.D.   Condition   Action   Link<-   Link->
349    3846         000         0        -        134
134    3846         0#1         0        349      -

I.D.   Corp' I.D.   Condition   Action   Link<-   Link->
202    3931         #1#                  -        328
328    3931         0#1         1        202      -
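The conditions in Table 1 use '#' as a don't-care symbol. A minimal matching sketch (our code) shows why the second-step stimulus is ambiguous: both corporations' second rules match 001, and only their corporate links tell the two mazes apart:

```python
def matches(condition: str, message: str) -> bool:
    """ZCS-style ternary matching: '#' is a don't-care, every other position
    must agree with the corresponding message bit."""
    return all(c == '#' or c == m for c, m in zip(condition, message))

# Rule 134's condition (0#1) and rule 328's (0#1) both match the ambiguous
# second-step stimulus 001; the links 349 -> 134 and 202 -> 328 disambiguate.
print(matches('0#1', '001'), matches('000', '001'))  # True False
```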

The average number of corporations formed in CCS after 10,000 trials in task
2:2 is just under 200, and these are all of size 2. Virtually every rule in the
population (size 400) has linked to a partner to form a two-member corporation. When N
is increased to 3, corporations of size 2 form initially (about 80 by trial 300) and
corporations of size 3 form more slowly (50 by trial 300). By trial 2000 there are
only two or three corporations of size 2, but just under 130 of size 3. Most rules in
the population (about 390 of 400 rules) therefore now belong to three-member
corporations.
CCS offers a much improved performance on all tasks, but the results indicate
that the discovery components of both systems found their performance impeded
as N increased.
The above results show that adding the process of symbiogenesis to the simple
ZCS can give improvements in performance in complex environments which con-
tain ambiguous sensory inputs.

4 Conclusion

Symbiogenesis appears to have been the major factor in two important steps of
cell evolution - the evolution of chromosomes and eukaryotes. In this paper we
have used a tuneable abstract model of multi-species evolution to examine the
conditions under which the main phase of symbiogenesis, hereditary endosymbio-
ses, can occur.
We viewed the formation of an hereditary endosymbiosis between two separate
species as a megamutation, that is, a macro-level evolutionary phenomenon. It was
seen, perhaps not surprisingly, that as the amount of interdependence between
the two species increased, the chances of the endosymbiosis becoming a more
advantageous association also increased. We found that, for any amount of
intra-genome epistasis, roughly an equivalent amount of inter-species epistasis was
required for the endosymbiosis to equally share a fixed population space, with it
dominating thereafter.
This general process was then added to a machine learning architecture and
shown to improve performance in more complex environments containing sensory
ambiguities. It is interesting to note that analogous mechanisms to those
suggested during the symbiotic emergence of proto-cells were required to realise
such behaviour.
This work is being explored under EPSRC grant no. GR/R06748 in the
application of LCS to distributed road traffic junction control. Here the aim is to use the
symbiogenesis process to capture the temporal aspects of the task and produce
effective unsupervised learning controllers which are examinable by engineers.

References

1. Maynard-Smith, J. & Szathmary, E. (1995), The Major Transitions in Evolution, W.H.
Freeman, New York.
2. Khakhina, L. N. (ed.) (1992), Concepts of Symbiogenesis: History of Symbiogenesis as
an Evolutionary Mechanism, Yale University Press, Yale.
3. Maynard-Smith, J. & Szathmary, E. (1993) The Origin of Chromosomes I: Selection
for Linkage. Journal of Theoretical Biology 164:437-466.
4. Kauffman, S. A. & Johnsen, S. (1989) Co-evolution to the Edge of Chaos: Coupled
Fitness Landscapes, Poised States and Co-evolutionary Avalanches. In C. G. Langton, C.
Taylor, J. D. Farmer & S. Rasmussen (eds.) Artificial Life II. Addison-Wesley, New
York, pp. 325-370.
5. Haynes, R. H. (1991), "Modes of Mutation and Repair in Evolutionary Rhythms", in
L. Margulis & R. Fester (eds.) Symbiosis as a Source of Evolutionary Innovation,
MIT Press, Massachusetts, pp. 40-56.
6. Packard, N. (1988), "Adaptation to the Edge of Chaos", in S. Kelso & M. Schlesinger
(eds.) Complexity in Biologic Modelling.
7. Holland, J. H. (1975), Adaptation in Natural and Artificial Systems, Univ. of Michigan
Press, Ann Arbor.
8. Kauffman, S. A. (1993) The Origins of Order. Oxford University Press, Oxford.
9. Allee, W. C., Emerson, A. E., Schmidt, K. P., Park, T. & Park, O. (eds.) (1949), Principles of
Animal Ecology, Saunders Company, London.
10. Smith, D. C. & Douglas, A. E. (eds.) (1987), The Biology of Symbiosis, Edward
Arnold, London.
11. Burns, T. P. (1993), "Discussion: Mutualism as Pattern and Process in Ecosystem
Organisation", in H. Kawanabe, J. E. Cohen & K. Iwaski (eds.) Mutualism and
Community Organisation, Oxford University Press, Oxford, pp. 239-251.
12. Ikegami, T. & Kaneko, K. (1990) "Genetic Fusion." Physical Review Letters, 65
(26):3352-3355.
13. Goldberg, D., Kargupta, H. & Harik, G. (1993) Rapid, Accurate Optimisation of
Difficult Problems using Fast Messy Genetic Algorithms. In S. Forrest (ed.) Proceedings
of the Fifth International Conference on Genetic Algorithms. Morgan Kaufmann,
pp. 56-64.
14. Wilson, S. W. (1994) "ZCS: A zeroth level classifier system." Evolutionary
Computation, 2 (1):1-18.
15. Wilson, S. W. & Goldberg, D. E. (1989) "A critical review of classifier systems." In
Schaffer, J. D. (ed.) Proceedings of the Third International Conference on Genetic
Algorithms, (pp. 244-255), Morgan Kaufmann.
16. Holland, J. H. (1995) Hidden Order. Addison-Wesley, New York.
17. Wilson, S. W. (1985) "Knowledge growth in an artificial animal." In Grefenstette, J. J.
(ed.) Proceedings of an International Conference on Genetic Algorithms and their
Applications (pp. 16-23), Lawrence Erlbaum Associates.
18. Booker, L. B. (1985) "Improving the performance of Genetic Algorithms in Classifier
Systems." In Grefenstette, J. J. (ed.) Proceedings of the First International Conference
on Genetic Algorithms and their Applications (pp. 80-92), Lawrence Erlbaum Assoc.
An Overview of Artificial Immune Systems

J. Timmis, T. Knight
Computing Laboratory, University of Kent, Canterbury, UK
J.Timmis@ukc.ac.uk

L. N. de Castro
School of Electrical Engineering and Computing,
State University of Campinas, Brazil

E. Hart
School of Computing, Napier University, Edinburgh, Scotland, UK

Abstract. The immune system is highly distributed, highly adaptive and
self-organising in nature, maintains a memory of past encounters and has the ability to
continually learn about new encounters. From a computational point of view, the
immune system has much to offer by way of inspiration to computer scientists and
engineers alike. As computational problems become more complex, increasingly,
people are seeking out novel approaches to these problems, often turning to nature
for inspiration. A great deal of attention is now being paid to the vertebrate im-
mune system as a potential source of inspiration, where it is thought that different
insights and alternative solutions can be gleaned, over and above other biologi-
cally inspired methods.
Given this rise in attention to the immune system, it seems appropriate to ex-
plore this area in some detail. This survey explores the salient features of the im-
mune system that are inspiring computer scientists and engineers to build Artifi-
cial Immune Systems. An extensive survey of applications is presented, ranging
from network security to optimisation and machine learning. However, this is not
complete, as no survey ever is, but it is hoped this will go some way to illustrate
the potential of this exciting and novel area of research.

1 Introduction

This contribution examines the growing field of artificial immune systems (AIS).
AIS can be defined as computational systems inspired by theoretical immunology
and observed immune functions, principles and models, which are applied to
problem solving [1]. The field of AIS is relatively new and draws upon work done by
many theoretical immunologists (e.g., Jerne [2], Perelson [3], and Bersini and
Varela [4] to name a few). What is of interest to researchers developing AIS is not
the modelling of the immune system, but extracting or gleaning of useful mecha-
nisms that can be used as metaphors or inspiration to help in the development of
(computational) tools for solving particular problems. Within biologically inspired
computing, it is quite common to see gross simplifications of the biological
systems on which the artificial systems are based: AIS is no exception. However, it
should be remembered that, although a good understanding of the biological
system is essential in this domain, it is inspiration from nature that is sought, rather
than the creation of accurate models.
Through reading the literature, it can be observed that AIS have been applied to
a wide range of application domains. Some of the first work in applying immune
system metaphors was undertaken in the area of fault diagnosis [5]. Later work
applied immune system metaphors to the field of computer security and virus de-
tection [6], which seemed to act as a catalyst for further investigation of the im-
mune system as a metaphor in many areas. However, as yet, there seems to be no
niche area for AIS. Some people have commented that this may be a weakness, or
a gap in the field [7] and that there needs to be a serious undertaking to find such a
niche area, and this will in turn serve to strengthen the area. It has also been argued that
AIS are incredibly flexible, as are many biologically inspired techniques, suitable
for a number of applications and can be thought of as a novel soft computing
paradigm, suitable for integration with many more traditional techniques [8].
The growing interest in AIS is reflected in the growing number of special ses-
sions and invited tracks at a number of well-established international conferences,
such as the IEEE SMC and GECCO conferences. The first international
conference on artificial immune systems (ICARIS) took place at the University of Kent
at Canterbury (UKC) in September 2002. Its great success in terms of organisation
and quality of papers presented motivated the second ICARIS to be held in Edin-
burgh in September 2003.
This chapter is organised in the following manner. First, reasons are given why
the immune system has generated such interest within the computing and engi-
neering community. This is followed by a simple review of relevant immunology
that has served as a foundation for much of the work reviewed in this contribution.
Immunology is a vast topic and no effort has been made to cover the whole area:
suitable citations are provided in the text to further direct the reader. The area of
AIS is then presented, in terms of a general framework proposed in [1]. A review
of AIS applications is then presented, providing a general overview of a number
of different application areas. Finally, comments on the perceived future
of this emerging technology are then presented.

2 The Immune System: Metaphorically Speaking

When considered from a computational point of view, the immune system can be
considered a rich source of inspiration, as it displays learning and adaptability, is
self-organising and highly distributed, and maintains a memory. There are many
reasons why the immune system is of interest to computing [1, 9]; these can be
summarised as follows:
• Recognition: The immune system has the ability to recognise, identify and re-
spond to a vast number of different patterns. Additionally, the immune system
can differentiate between malfunctioning self cells and harmful non-self cells,
therefore maintaining some sense of self.
An Overview of Artificial Immune Systems 53

• Feature extraction: Through the use of Antigen Presenting Cells (APCs), the
immune system is able to extract features of disease-causing agents (antigens)
by filtering out molecular noise before the antigen is presented to other immune
cells, including the lymphocytes.
• Diversity: There are two major processes involved in the generation and main-
tenance of diversity in the immune system. The first is the generation of recep-
tor molecules through the recombination of gene segments from gene libraries.
By recombining genes from a finite set, the immune system is capable of gen-
erating an almost infinite number of different types of receptor, thus endowing
the immune system with a large coverage of the universe of antigens. The sec-
ond process, which assists with diversity in the immune system, is known as
somatic hypermutation. Immune cells reproduce themselves in response to in-
vading antigens. During reproduction, they are subjected to somatic mutation
at high rates, allowing the creation of novel patterns of receptor molecules and
thus increasing the diversity of the immune receptors [10].
• Learning: The mechanism of somatic hypermutation followed by a strong se-
lective pressure also allows the immune system to fine-tune its response to an
invading pathogen, a process termed affinity maturation [11]. Affinity matura-
tion ensures that the immune system becomes increasingly better at the task
of recognising patterns. The immune network theory is another powerful exam-
ple of learning in the immune system. It suggests that the immune system has a
dynamic set of mutually recognising cells and molecules, and that the presence
of an invading antigen causes a perturbation in this network. As a result, the dy-
namic immune network, which presents an intrinsic steady state in the absence
of antigens, has to re-organise its pattern of behaviour so as to accommodate
the disturbance [7]. Therefore, invading antigens require the immune network
to adapt itself to this new element.
• Memory: After an immune response to a given antigen, some sets of cells and
molecules are endowed with increased life spans in order to provide faster and
more powerful immune responses to future infections by the same or similar
antigens. This process, known as the maturation of the immune response, al-
lows the maintenance of those cells and molecules successful at recognising an-
tigens. This is the major principle behind vaccination procedures in medicine
and immunotherapy. A weakened or dead sample of an antigen (e.g., a virus) is
inoculated into an individual so as to promote an immune response (with no
disease symptoms) in order to generate memory cells and molecules to that an-
tigen.
• Distributed detection: There is inherent distribution within the immune system.
There is no one point of overall control; each immune cell is specifically stimu-
lated and responds to new antigens that can invade the organism in any loca-
tion.
• Self-regulation: Immune system dynamics are such that the immune population
is controlled by local interactions and not by a central point of control. After a
disease has been successfully combated by the immune system, it returns to its
normal steady state until it is needed in response to another antigen. The
immune network theory explicitly accounts for this type of self-regulatory
mechanism.
54 J. Timmis et al.

• Metadynamics: The immune system is constantly creating new cells and mole-
cules, and eliminating those that are too old or no longer of use.
Metadynamics is the name given to this continuous production, recruitment and
death of immune cells and molecules [12].
• Immune network: In 1974 Jerne [2] proposed the immune network theory as an
alternative explanation of how the immune system works. He suggested that the
immune system is a dynamic system whose cells and molecules are capable of
recognising each other, thus forming an internal network of communication
within the organism. This network provides the basis for immunological mem-
ory to be achieved, via a self-supporting and self-organising network.
The remainder of this chapter outlines some of the salient features of the im-
mune system that have been employed in the development of AIS. Attention is
then drawn to significant applications of the immune system as a metaphor for
computational systems.

3 The Vertebrate Immune System

The vertebrate immune system is composed of diverse sets of cells and molecules
that work in collaboration with other bodily systems in order to maintain a steady
state within the host. A role of the immune system is to protect our bodies from in-
fectious agents such as viruses, bacteria, fungi and other parasites. On the surface
of these agents are antigens that allow the identification of the invading agents
(pathogens) by the immune cells and molecules, thus provoking an immune re-
sponse. There are two basic types of immunity, innate and adaptive. Innate immu-
nity [13] is not directed towards specific invaders into the body, but against any
pathogens that enter the body. The innate immune system plays a vital role in the
initiation and regulation of immune responses, including adaptive immune re-
sponses. Specialised cells of the innate immune system evolved so as to recognise
and bind to common molecular patterns found only in microorganisms, but the in-
nate immune system is by no means a complete solution to protecting the body.
Adaptive or acquired immunity [14], however, allows the immune system to
launch an attack against any invader that the innate system cannot remove. The
adaptive system is directed against specific invaders, and is modified by exposure
to such invaders. The adaptive immune system mainly consists of lymphocytes,
which are white blood cells, more specifically B-cells and T-cells. These cells aid
in the process of recognising and destroying specific substances. Any substance
that is capable of generating such a response from the lymphocytes is called an an-
tigen or immunogen. Antigens are not the invading microorganisms themselves;
they are substances such as toxins or enzymes in the microorganisms that the im-
mune system considers foreign. Adaptive immune responses are normally directed
against the antigen that provoked them and are said to be antigen specific. The
immune system generalises by virtue of the presence of the same antigens in more
than one infectious agent. Many immunisations exploit this by presenting the im-
mune system with an innocuous organism, which carries antigens present in more
dangerous organisms. Thus the immune system learns to react to a particular pat-
tern of antigen.
The immune system is said to be adaptive, in that when an adaptive immune re-
sponse is elicited B-cells undergo cloning in an attempt to produce sufficient anti-
bodies to remove the infectious agent [2,15]. When cloning, B-cells undergo a
stochastic process of somatic hypermutation [10] where an attempt is made by the
immune system to generate a wider antibody repertoire so as to be able to remove
the infectious agent from the body and prepare the body for infection from a simi-
lar but different infection at some point in the future.
After the primary immune response, when the immune system first encounters
a foreign substance and the substance has been removed from the system, a certain
quantity of B-cells remains in the immune system and acts as an immunological
memory [2,16]. This allows the immune system to launch a faster and
stronger attack against the infecting agent, called the secondary immune response.

3.1 Primary and Secondary Immune Responses

A primary response [17] is provoked when the immune system encounters an an-
tigen for the first time. A number of antibodies will be produced by the immune
system in response to the infection, which will help to eliminate the antigen from
the body. However, after a period of days the levels of antibody begin to degrade,
until the antigen is encountered again. This secondary immune re-
sponse is said to be specific to the antigen that first initiated the immune response
and causes a very rapid growth in the quantity of B-cells and antibodies. This sec-
ond, faster response is attributed to memory cells remaining in the immune system,
so that when the same or a similar antigen is encountered, a new immunity does
not need to be built up; it is already there. This means that the body is ready to
combat any re-infection. Fig. 1 illustrates this process.
The amount of antibody is increased by the immune system generating a mas-
sive number of B-cells through a process called clonal selection [16], which is
now discussed in relation to the B-cell in the immune system.

3.2 B-cells and Antibodies

The B-cell is an integral part of the immune system. Through a process of recogni-
tion and stimulation, B-cells will clone and mutate to produce a diverse set of an-
tibodies in an attempt to remove the infection from the body [18]. The antibodies
are specific proteins that recognise and bind to another protein. The production
and binding of antibodies is usually a way of signalling other cells to kill, ingest or
remove the bound substance [19]. Each antibody has two paratopes and two epi-
topes, the specialised parts of the antibody that identify other molecules
[20]. Binding between antigens and antibodies is governed by how well the para-
topes on the antibody match the epitope of the antigen; the closer this match, the
stronger the bind. Although it is the antibodies surrounding the B-cell that are
responsible for recognising and attaching to antigen invaders, it is the B-cell itself
that has one of the most important roles in the immune system.
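In AIS models, this paratope-epitope binding is commonly abstracted as the degree of complementarity between two bit strings. The sketch below illustrates the idea; the bit-string encoding and the complementarity count are common AIS conventions rather than anything specified in this chapter:

```python
def bind_strength(paratope: str, epitope: str) -> int:
    """Count complementary bit positions between two equal-length bit
    strings; a higher count models a stronger antibody-antigen bind."""
    assert len(paratope) == len(epitope)
    return sum(p != e for p, e in zip(paratope, epitope))

# A bind is usually accepted only when complementarity exceeds a threshold.
print(bind_strength("10110", "01001"))  # fully complementary -> 5
print(bind_strength("10110", "10110"))  # identical strings -> 0
```

The closer the two strings are to being exact complements, the stronger the simulated bind, mirroring the paratope-epitope match described above.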
[Figure: antibody concentration plotted against time, showing the lag before the primary response to Ag1, the faster secondary response, and the cross-reactive response to Ag1']
Fig. 1. Primary and secondary immune response. Ag1 infects the system and a lag occurs
before a primary immune response is initiated. The host is then re-infected with Ag1 and a
different antigen Ag2. A fast secondary response is elicited against Ag1, whilst a primary re-
sponse is initiated against Ag2. At some point in the future, the host is then infected with
Ag1', which is a slight variation on Ag1. Due to the generalist capability of the immune sys-
tem, a secondary response is elicited against the antigen. (Redrawn from [1])

This is not the full story, as B-cells are also affected by helper T-cells during
the immune response [21]. T-cell paratopes differ from those on B-cells in
that they recognise fragments of antigens that have been combined with molecules
found on the surfaces of other cells. These molecules are called MHC mole-
cules (Major Histocompatibility Complex). As T-cells circulate through the body
they scan the surfaces of body cells for the presence of foreign antigens that have
been picked up by the MHC molecules. This function is sometimes called immune
surveillance. When bound to an antigen, these helper T-cells secrete interleukins
that act on B-cells, helping to stimulate them.

3.3 Immune Memory

It is possible to identify two main philosophical avenues that try to explain how
immune memory is acquired and maintained [22-35]: (1) clonal expansion and se-
lection and (2) immune network.
Throughout the lifetime of an individual, it is expected to encounter a given an-
tigen repeatedly. The initial exposure to an antigen that stimulates an adaptive
immune response is handled by a spectrum of small clones of B-cells, each pro-
ducing antibodies of different affinity. The effectiveness of the immune response
to secondary encounters is considerably enhanced by storing some high affinity
antibody producing cells from the first infection, named memory cells, so as to
form a large initial clone for subsequent encounters. Thus memory, in the context
of secondary immune responses, is a clonal property [26].
Another theory that has been used in AIS for inspiration is the theory first pro-
posed by Jerne [2] and reviewed by Perelson [3], called the immune network the-
ory. This theory states that B-cells co-stimulate each other via portions of their re-
ceptor molecules (idiotopes) in such a way as to mimic antigens. An idiotope is
made up of amino acids within the variable region of an antibody or T-cell. A
network of B-cells is thus formed and highly stimulated B-cells survive and less
stimulated B-cells are removed from the system. It is further proposed that this
network yields useful topological information about the relationship between anti-
gens. For these reasons, this section focuses on this theory.

3.3.1 Immunological Memory via the Immune Network


Work in Jerne [2] proposed that the immune system is capable of achieving im-
munological memory through the existence of a mutually reinforcing network of B-
cells. These cells not only stimulate each other but also suppress connected B-
cells, though to a lesser degree. This suppression is a mechanism by
which the over-stimulation of B-cells is regulated in order to maintain a stable
memory.
This network of B-cells occurs due to the ability of paratopes, located on B-
cells, to match against idiotopes on other B-cells. The binding between idiotopes
and paratopes has the effect of stimulating the B-cells. This is because the para-
topes on B-cells react to the idiotopes on similar B-cells as they would to an antigen.
However, to counter the reaction there is a certain amount of suppression between
B-cells, which acts as a regulatory mechanism. Fig. 2 shows the basic principles of the
immune network theory. Here B-cell 1 stimulates three other cells, B-cells 2, 3
and 4, and also receives a certain amount of suppression from each one. This cre-
ates a network-type structure that provides a regulatory effect on neighbouring B-
cells. The immune network acts as a self-organising and self-regulatory system
that captures antigen information, ready to launch an attack against any similar an-
tigens.

3.3.1.1 A proposed immune network model


Attempts have been made at creating immune network models [27,28] so as to
better understand their complex interactions. Work in Farmer et al. [27] proposed a
model to capture the essential characteristics of the immune network as described
by Jerne [2] and to identify memory mechanisms in it, whereas Carneiro and Stewart
[28] observed how the immune system identifies self and non-self. Both Farmer et
al. [27] and Perelson [3] investigated Jerne's work in more depth and provided in-
sights into some of the mechanisms involved in the production and dynamics of
the immune network. There are a number of immune network models, and it is im-
possible to review them all here. This section will summarise the salient features of the
Farmer et al. model, as a form of case study to illustrate the potential power of
such a model for computation.
[Figure: a network of B-cells linked by solid arrows denoting stimulation and dashed arrows denoting suppression]

Fig. 2. Jerne's idiotypic network hypothesis

Farmer et al. [27] created a simple model to simulate the immune system.
The model ignored the effects of T-cells and of macrophages in an attempt to cap-
ture the essential characteristics of the immune network. Central to their work was
the calculation of the dynamics of the B-cell population in relation to a B-cell's stimula-
tion level. The authors proposed a simple equation that they consider takes into
account the three main contributing factors to B-cell stimulation level, these being:
(i) the contribution of antigen binding, (ii) the contribution of neighbouring B-
cells and (iii) the suppression by neighbouring B-cells. The rate of change of anti-
body concentration is given by [27]:

dx_i/dt = c [ Σ_{j=1}^{N} m_{ji} x_i x_j  −  k_1 Σ_{j=1}^{N} m_{ij} x_i x_j  +  Σ_{j=1}^{M} m_{ji} x_i y_j ]        (1)

where the first term represents the stimulation of the paratope of an antibody of type i
by the epitope of an antibody of type j. The second term represents the suppression of an
antibody of type i when its epitope is recognised by the paratope of type j, and the
third term represents the stimulation due to binding antigens y. The pa-
rameter c is a rate constant that depends on the number of collisions per unit time
and the rate of antibody production stimulated by a collision. The constant k_1 repre-
sents a possible inequality between stimulation and suppression.
B-cell cloning and mutation were included in the model to
create a diverse set of B-cells. The amount by which any one B-cell cloned was
related to how stimulated the B-cell was: the more stimulated a B-cell, the more
clones it produced. Three mutation mechanisms were introduced on the strings:
crossover, inversion and point mutation. Crossover is the interchanging of portions
of two different strings, inversion is the inverting of the value of a
bit in a string, a 0 to a 1 and vice versa, and point mutation is the random changing
of a bit in a given string.
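The dynamics described by equation (1) can be simulated directly. The sketch below performs a single Euler integration step of the antibody concentrations; the matching matrices and the values of c, k_1 and the step size dt are chosen arbitrarily for illustration and are not taken from Farmer et al.:

```python
import numpy as np

def step(x, y, m_ab, m_ag, c=0.1, k1=0.5, dt=0.01):
    """One Euler step of equation (1).
    x: antibody concentrations, shape (N,)
    y: antigen concentrations, shape (M,)
    m_ab[j, i]: match of antibody j's epitope to antibody i's paratope
    m_ag[j, i]: match of antigen j to antibody i's paratope
    """
    stim = x * (m_ab.T @ x)   # stimulation:      sum_j m_ji * x_i * x_j
    supp = x * (m_ab @ x)     # suppression:      sum_j m_ij * x_i * x_j
    drive = x * (m_ag.T @ y)  # antigen binding:  sum_j m_ji * x_i * y_j
    return x + dt * c * (stim - k1 * supp + drive)

x = np.array([1.0, 0.5])                  # two antibody types
y = np.array([2.0])                       # one antigen
m_ab = np.array([[0.0, 0.8], [0.2, 0.0]])
m_ag = np.array([[1.0, 0.0]])             # antigen matches antibody 0 only
print(step(x, y, m_ab, m_ag))
```

Iterating this step (and adding the cloning and mutation operators described above) reproduces the population dynamics the model was built to study.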
3.4 Repertoire and Shape Space

Coutinho [29] first postulated the idea of repertoire completeness. He stated that,
if the immune system's antibody repertoire is complete, that is, it presents receptor
molecules capable of recognising any molecular shape, then antibodies with im-
munogenic idiotopes can be recognised by other antibodies, and therefore an idio-
typic network would be created.
However, in order to understand completeness, it is first necessary to under-
stand the concept of shape space. Shape space has been an important mechanism
for creating and representing abstract models of immune cells and molecules [1]. The
basic idea is that all the features of a receptor molecule necessary to characterise
its binding region with an antigen are called its generalised shape. The generalised
shape of any receptor molecule can be represented by an attribute string of a given
length L in a generic L-dimensional space, called shape space.
To illustrate this idea, consider a bi-dimensional space as illustrated in Fig. 3.
The set of all possible shapes lies within a finite volume V in this bi-dimensional
shape space. The antibodies are represented by the letter A (black dots) and the an-
tigens are depicted by x. Each antibody (A) can recognise a given number of
antigens within an affinity threshold ε, and therefore can recognise a volume (Vε)
of antigens (x) in shape space. Therefore, a finite set of antibodies appropriately
placed in the shape space, with appropriate affinity thresholds, is sufficient to
cover the whole shape space, thus being capable of recognising any molecular
shape that can be presented to the immune system.
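The shape-space idea is straightforward to express in code: each receptor is a point in an L-dimensional space and recognises every antigen lying within its affinity threshold ε. A minimal sketch, in which the Euclidean affinity measure and the particular coordinates are illustrative assumptions:

```python
import math

def recognises(antibody, antigen, eps):
    """An antibody recognises an antigen if the antigen lies inside
    the ball of radius eps around the antibody in shape space."""
    return math.dist(antibody, antigen) <= eps

antibody = (0.5, 0.5)                  # a point in 2-D shape space
antigens = [(0.6, 0.4), (0.9, 0.9)]
covered = [ag for ag in antigens if recognises(antibody, ag, eps=0.2)]
print(covered)  # → [(0.6, 0.4)]
```

Covering the whole shape space then amounts to placing finitely many such balls so that every possible antigen falls inside at least one of them.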

3.5 Learning within the Immune Network

It has been proposed that the immune network can be thought of as being cogni-
tive [12] and that it exhibits learning capabilities. The authors proposed four reasons
why they consider immune systems to be cognitive: (i) they can recognise mo-
lecular shapes; (ii) they remember the history of encounters; (iii) they define the
boundaries of self; and (iv) they can make inferences about antigenic patterns they
have yet to encounter. Taking these points, the paper explores cognitive mecha-
nisms of the immune system and proposes that the immune network can be
thought of as a cognitive network, in a similar way to a neural network.
Fig. 3. A diagrammatic representation of shape space. Adapted from [3]

The work suggests that the immune network is capable of producing dynamic
patterns of activity over the entire network, and that a self-regulatory
mechanism is at work that helps to maintain this network structure. These emerging
patterns within the immune network are characterised by varying numbers of B-
cells that, in response to an antigen, undergo clonal selection. The authors
use the term metadynamics of the immune system; see also [30]. This can essen-
tially be taken to mean the continual production and death of immune cells and
molecules. A large variety of new B-cells will be produced, but not all will be a
useful addition to the immune system; many will never enter into the dynamics
of the immune system (interact with other B-cells in the network) and will eventu-
ally die. The authors produced a simple model using these ideas and found that
there are oscillations in many of the variables within their system, in particular the
number of B-cells that are produced. There would often be rapid production of B-
cells, followed by a sharp decline in number, which, the authors argue, is what one
would expect to see in the natural immune system. Coupled with this oscillatory pattern,
the authors observed that a certain core and stable network structure does emerge
over time. This structure emerges due to a topological self-organisation within the
network, with the resulting network acting to record the history of encounters with
antigens. Therefore, the authors concluded that the immune system is an excellent
system for learning about new items and can support a memory of encounters by
the use of complex pattern matching and a self-organising network structure, and
can thus be thought of as being cognitive.
There is other research that supports the ideas presented above. Bersini
and Varela [4] implemented the model proposed by Varela et al. [12] and sug-
gested that mechanisms such as immune memory, adaptability and the immune
system's ability to perform distributed processing could be of potential use in en-
gineering problem solving, in particular adaptive control [31] and computational
problem solving.
Following their earlier work [4], Bersini and Varela [30] provide an effec-
tive summary of work done on exploring the dynamics and metadynamics of the
immune system. They claim that the metadynamics of the immune system allows
the identity of the immune system to be preserved over time, while still allowing it
to adapt to new situations. Simulations of an immune network confirmed this. The
reader is also directed to [7], where further arguments for this position are pro-
posed.
As a way to model immune system metadynamics, the authors proposed the
use of the immune recruitment mechanism (IRM). The IRM is a mechanism by
which the best new cells and molecules in the system are incorporated into the
network; that is, only the best new items that are produced should be incorporated.
The selection of new items is therefore based on the state of the surrounding network:
any other items that are produced are lost. This gives rise to the metadynamical
system that is believed to operate in the vertebrate immune system. In this paper,
the authors proposed seven general principles that can be extracted from the
immune system and applied to creating a control system for the area of adaptive
control and, they hope, for other fields as well. These principles are:
• Principle 1: The control of any process is distributed around many operators in
a network structure. This allows for the development of a self-organising sys-
tem that can display emergent properties.
• Principle 2: The controller should maintain the viability of the process being
controlled, keeping the system within certain limits and preventing it from
being driven in one particular direction.
• Principle 3: While there may be perturbations that affect the process, the
controller learns to maintain the viability of the process through adaptation.
This learning and adaptation require two kinds of plasticity: a parametric plas-
ticity, which keeps a constant population of operators in the process but modi-
fies parameters associated with them; and a structural plasticity, based on
the recruitment mechanism, which can modify the current population of op-
erators.
• Principle 4: Learning and adaptation are achieved by using a reinforcement
mechanism between operators. Operators interact to support common opera-
tions or controls.
• Principle 5: The dynamics and metadynamics of the system can be affected by
the sensitivity of the network.
• Principle 6: The immune recruitment mechanism can be considered to be a
stand-alone optimisation algorithm.
• Principle 7: The controller retains a population-based memory, which can
maintain a stable level in a changing environment.
The authors suggest that these principles, while very general, could prove
useful in many domains of learning, engineering control and so on. Indeed, in their
paper they present a way of applying these general principles to the area of adap-
tive control and to the creation of other immune-inspired algorithms.

3.6 The Clonal Selection Principle

When antibodies on a B-cell bind with an antigen, the B-cell becomes activated
and begins to proliferate. New B-cell clones are produced that are an exact copy of
the parent B-cell, but they then undergo somatic hypermutation [11] and produce anti-
bodies that are specific to the invading antigen. The clonal selection principle [15]
is the term used to describe the basic properties of an adaptive immune response to
an antigenic stimulus, and it is an alternative view to the position presented in the
previous section. It establishes the idea that only those cells capable of recognis-
ing an antigenic stimulus will proliferate, thus being selected over those that do
not. Clonal selection operates on both T-cells and B-cells.
The B-cells, in addition to proliferating or differentiating into plasma cells, can
differentiate into long-lived B memory cells. Memory cells circulate through the
blood, lymph and tissues, probably not manufacturing antibodies [32]. However,
when exposed to a second antigenic stimulus they commence differentiating into
large lymphocytes capable of producing high affinity antibody.

3.6.1 Learning and Memory via Clonal Selection


In order for the immune system to be protective over periods of time, antigen rec-
ognition alone is insufficient. The immune system must also have a sufficient number of
cells and molecules to mount an effective response against antigens encoun-
tered at a later stage. The number of immune cells and molecules specific for the
antigen, relative to the size of the antigen's population, is crucial in determin-
ing the outcome of infection. Learning via clonal selection involves raising the
population size and the affinity of those cells that have proven themselves
valuable during the antigen recognition phase. Thus, the immune repertoire is bi-
ased from a random base towards a repertoire that more clearly reflects the actual anti-
genic environment.
In the normal course of the evolution of the immune system, an organism
would be expected to encounter a given antigen repeatedly during its lifetime. The
initial exposure to an antigen that stimulates an adaptive immune response (an
immunogen) is handled by a small number of B-cells, each producing antibodies
of different affinity. Storing some high affinity antibody-producing cells from the
first infection, so as to form a large initial specific B-cell sub-population (clone)
for subsequent encounters, considerably enhances the effectiveness of the immune
response to secondary encounters. These are referred to as memory cells. Rather
than 'starting from scratch' every time, such a strategy ensures that both the speed
and accuracy of the immune response become successively greater after each in-
fection.
In summary, immune learning and memory are acquired through:


• Repeated exposure to an antigenic stimulus
• An increase in the population size of specific immune cells and molecules
• Affinity maturation of the antigenic receptors
• The presence of long-lived cells that persist in a resting state until a second encoun-
ter with the antigen
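These ideas translate into the kind of clonal selection step used by many AIS algorithms: clone in proportion to affinity, mutate in inverse proportion to affinity, then keep the best resulting cells as the new (memory) population. The affinity measure, rates and population sizes below are illustrative assumptions rather than values from the literature:

```python
import random

def affinity(cell, antigen):
    """Affinity as the number of matching bits (higher is better)."""
    return sum(c == a for c, a in zip(cell, antigen))

def hypermutate(cell, rate):
    """Flip each bit with probability `rate` (somatic hypermutation)."""
    return [b ^ (random.random() < rate) for b in cell]

def clonal_selection_step(population, antigen, clone_factor=3):
    clones = []
    for cell in population:
        aff = affinity(cell, antigen)
        n_clones = 1 + clone_factor * aff   # higher affinity -> more clones
        rate = 1.0 / (1 + aff)              # higher affinity -> less mutation
        clones += [hypermutate(cell, rate) for _ in range(n_clones)]
    # clonal selection: only the best clones survive into the repertoire
    clones.sort(key=lambda c: affinity(c, antigen), reverse=True)
    return clones[:len(population)]

random.seed(0)
antigen = [1, 0, 1, 1]
population = [[0, 0, 0, 0], [1, 0, 0, 1]]
population = clonal_selection_step(population, antigen)
print(max(affinity(c, antigen) for c in population))
```

Repeating the step over several exposures biases the repertoire towards the antigen, mirroring the affinity maturation process described above.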

3.7 Self/Non-Self Discrimination

The immune system is said to be complete: it has the ability to recognise all anti-
gens. Antibodies and T-cell receptors produced by the lymphocytes can recognise
any foreign (or self) molecule. Antibody molecules have idiotopes, and it follows
from the idea of completeness that these will be recognised by other antibody
molecules.
Therefore, all molecules (shapes) can be recognised, including our own, which
are also seen as antigens, or self-antigens. For the immune system to function
properly, it needs to be able to distinguish between the molecules of our own cells
(self) and foreign molecules (non-self), which are a priori indistinguishable [33].
If the immune system is not capable of performing this distinction, then an im-
mune response will be triggered against the self-antigens, causing autoimmune
diseases.
An encounter between an antibody and an antigen does not inevitably result in
activation of the lymphocyte. It is possible that the encounter could actually cause
the death of the lymphocyte. In order for this to happen, there must be some form
of negative selection that prevents self-specific lymphocytes from becoming
prevalent.

3.7.1 Negative Selection

The concept of a negative signal following certain lymphocyte-antigen interac-
tions allows for the control of those lymphocytes that are anti-self. Negative selec-
tion of a lymphocyte describes the process whereby a lymphocyte-antigen interac-
tion results in the death or anergy of that lymphocyte. The immune cell is simply
purged from the repertoire. Location plays a role in negative selection: the primary
lymphoid organs are designed largely to exclude foreign antigens and to preserve
the self-antigens, whereas the secondary lymphoid organs are designed to filter out
and concentrate foreign material, and to promote co-stimulatory intercellular im-
mune reactions [34].
The negative selection of T-cells has been broadly used by the AIS community
as a model for performing anomaly detection. Basically, the negative selection of T-
cells that occurs within the thymus is based on the following considerations. The
thymus comprises a myriad of cells that primarily present self-molecules to
naïve T-cells (immature T-cells, just produced and with no function yet). The
interactions of immature T-cells with the self-molecules result in the death of all
those naïve T-cells that recognise the self-molecules. This means that only T-cells
that do not recognise self-molecules are allowed to survive and become functional
T-cells.
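This censoring process is the basis of the negative selection algorithm widely used in AIS anomaly detection: candidate detectors are generated at random and any that match a self sample are purged, just as self-reactive naïve T-cells are deleted in the thymus. The sketch below uses illustrative assumptions (binary strings and a simple Hamming-distance matching rule) rather than any scheme specified in this chapter:

```python
import random

def matches(detector, sample, threshold=1):
    """A detector matches a sample if their Hamming distance is small."""
    return sum(d != s for d, s in zip(detector, sample)) <= threshold

def negative_selection(self_set, n_detectors, length=6):
    """Generate random detectors and censor any that match a self string,
    mimicking the deletion of self-reactive naive T-cells in the thymus."""
    detectors = []
    while len(detectors) < n_detectors:
        candidate = [random.randint(0, 1) for _ in range(length)]
        if not any(matches(candidate, s) for s in self_set):
            detectors.append(candidate)   # survives negative selection
    return detectors

random.seed(1)
self_set = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
detectors = negative_selection(self_set, n_detectors=5)
# Any string matched by a surviving detector is flagged as non-self.
print(all(not matches(d, s) for d in detectors for s in self_set))  # → True
```

By construction, the surviving detectors never fire on self, so anything they do match can be treated as anomalous.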

4 From Natural to Artificial Immune Systems

The immune system is a valuable metaphor as it is self-organising, highly distrib-
uted and has no central point of control. The theoretical aspects summarised above
reveal interesting avenues for using the immune system as a metaphor for devel-
oping novel computational intelligence paradigms. These can potentially be ap-
plied to solve many problems in a wide range of domains, such as data mining,
control and anomaly detection, to name a few. Some of these applications will be
discussed in the following sections. Some of the interesting immunological as-
pects can be summarised as follows:
• Using the idea of self-organisation. Self-organisation is the ability of a system
to adapt its internal structure to the environment without any external supervi-
sion. In the case of the immune system, clonal selection followed by affinity
maturation and the immune network adapt to new antigens encountered and
ultimately can be said to represent those antigens. This fits in with general
principle 1 described above, of having some inherent self-organising structure
within a system that will exhibit emergent properties.
• The primary and secondary immune responses. It has been shown that more B-
cells are produced in response to continual exposure to antigens. This suggests
that to learn from data using the immune system metaphor, the data may have to
be presented a number of times in order for the patterns to be captured.
• Using the idea of clonal selection. As B-cells become stimulated they repro-
duce in order to create more antibodies to remove the antigen from the system.
This causes clusters of B-cells that are similar to appear. Clusters indicate simi-
larity and could be useful in understanding common patterns in data, just as a
large amount of specific B-cells in the immune system indicates a certain anti-
gen.
• Adaptation and diversification. Some B-cell clones undergo somatic hypermu-
tation. This is an attempt by the immune system to develop a set of B-cells and
antibodies that can remove not only the specific antigen, but also similar anti-
gens. By using the idea of mutation a more diverse representation of the data
being learnt is gained than a simple mapping of the data could achieve. This
may be of benefit and reveal subtle patterns in data that may be missed.
• Knowledge extraction and generalisation. Somatic hypermutation may be not
only beneficial to generalise knowledge, i.e., to reveal subtle patterns in data
but, together with a selective event, it might guarantee that those B-cells with
increased affinities are selected and maintained as high affinity cells. The con-
tinuous processes of mutation and selection (affinity maturation) allow the im-
mune system to extract information from the incoming antigens. Affinity matu-
ration performs a better exploitation (greedy search) of the surrounding regions
of the antibodies.
• The use of a network structure. The immune network represents an effective
way of simulating a dynamic system and achieving memory. This idea could be
exploited in helping to maintain a network of B-cells that are creating a model
of some data being learnt. Indeed, visualising that network may reveal useful
topological information about the network that leads to a greater understanding
of the data being modelled.
• Metadynamics. The oscillations of immune system variables, such as antibody
concentration and B-cell population, as discussed in [12], indicate that a stable
network representative of the data being learnt could be possible. This would be
very useful: once a pattern had been learnt, it would only be forgotten if it became
useless in the far future. Additionally, the networks produced act as a life-long
learning mechanism, with the B-cell population always in a state of flux, but
representative of the antigens it has been exposed to. This could be a useful meta-
phor for developing a system that could, in principle, learn a set of patterns in
one data set, then go on to learn new patterns from other data sets, while still
remembering the older ones.
• Knowledge of self and non-self. The immune system has a complete repertoire
in its ability to recognise invading antigens. Additionally, the immune system is
said to be tolerant to self, in that it can recognise the difference between self
and non-self cells. This is a powerful metaphor when considering anomaly de-
tection systems.
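Several of the points above (clonal selection, affinity maturation by hypermutation, and selection of improved clones) can be combined into one minimal sketch. This is an illustrative toy assuming a real-valued antibody/antigen representation; the population size, clone counts and Gaussian mutation scheme are invented for the example, not taken from any published AIS:

```python
import random

def affinity(antibody, antigen):
    # Negative Euclidean distance: higher affinity means a closer match
    return -sum((a - b) ** 2 for a, b in zip(antibody, antigen)) ** 0.5

def clonal_selection(population, antigen, n_select=3, clones_per_cell=5,
                     rng=random.Random(0)):
    # Select the highest-affinity cells, clone them, and hypermutate the
    # clones; mutation is stronger for lower-ranked parents, mimicking the
    # greedy local search of affinity maturation. Fixed seed for
    # reproducibility of this illustration.
    ranked = sorted(population, key=lambda ab: affinity(ab, antigen),
                    reverse=True)
    next_gen = []
    for rank, parent in enumerate(ranked[:n_select]):
        rate = 0.1 * (rank + 1)          # worse rank -> stronger mutation
        for _ in range(clones_per_cell):
            clone = [x + rng.gauss(0, rate) for x in parent]
            next_gen.append(clone)
    # Keep the best cells among the parents and their mutated clones
    return sorted(next_gen + ranked[:n_select],
                  key=lambda ab: affinity(ab, antigen),
                  reverse=True)[:len(population)]
```

Because the selected parents survive alongside their clones, the best affinity in the population never decreases; repeated iterations drive the repertoire toward the antigen while the mutation noise maintains diversity around it.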

4.1 Summary

Immunology is a vast topic; therefore, this chapter has introduced only those areas
of immunology that are pertinent to this contribution. Through a process of match-
ing between antibodies and antigens and the production of B-cells through clonal
selection [15] and somatic hypermutation [10], an immune response can be elic-
ited against an invading antigen so that it is removed from the system. In order to
remember which antigens the immune system has encountered, some form of im-
munological memory must be present; this can be explained in part through theo-
ries such as the clonal selection theory or the more controversial immune network
theories. Clearly, the immune system is performing a very important role within
the body. The sheer complexity of the system is staggering, and current immunol-
ogy only knows part of the story. Through complex interactions, the immune sys-
tem protects our bodies from infection and interacts with other bodily systems to
maintain a steady state (homeostasis). The focus of this chapter has been more on
the immune network theory. This is not to lend more weight to that particular view-
point of the immune system; it has merely been presented in more depth to pro-
vide the reader with a deeper insight into one of the many complex ideas within
immunology that have helped computer scientists and engineers over the years.
This area will now be examined in more detail.
5 The Immune System Metaphor

This section introduces the reader to the field of Artificial Immune Systems (AIS).
There have been a number of attempts over the years to define exactly
what an AIS is. For example, [18] defined an AIS to be 'a computational system
based upon metaphors of the natural immune system' and [9] defined them to be
'intelligent methodologies inspired by the immune system toward real-world prob-
lem solving'. Feeling that neither of these definitions was complete, the most re-
cent definition is taken from de Castro and Timmis [1], who define AIS to
be 'adaptive systems, inspired by theoretical immunology and observed immune
functions, principles and models, which are applied to problem solving'. This
latest definition captures a more complete view of what AIS are: they are inspired
by the immune system, but the inspiration is not restricted to purely theoretical
immunology and also draws on 'wet lab' immunology; the systems are adaptive,
which means they must demonstrate some element of adaptability; they are not
restricted to software but could equally be implemented in hardware; and there is
ultimately some form of application in mind. This allows a distinction to be drawn
from pure models of the immune system (which are indeed useful for AIS, as has
been discussed).
This section presents an overview of many different applications of AIS that
can be seen in the literature. No attempt has been made at an exhaustive survey;
for this the readers are directed to de Castro and Timmis [1], Chap. 4, where such
a review is presented. The aim of this section is merely to illustrate
the wide applicability of AIS. Very recently, de Castro and Timmis [1] have pro-
posed the idea of a framework for AIS which consists of basic components and
processes, from which it is possible to both describe and build AIS. This frame-
work is now presented; however, because it was only recently proposed, it has
not been used in this article when describing AIS literature published before its
existence.

5.1 A Framework for AIS

In an attempt to create a common basis for AIS, de Castro and Timmis [1] pro-
posed the idea of a framework for AIS. The authors argued the case for proposing
such a framework from the standpoint that, for other biologically in-
spired approaches such as artificial neural networks (ANN) and evolutionary al-
gorithms, such a basic idea exists and helps considerably with the understanding
and construction of such systems. For example, de Castro and Timmis [1] consider
a set of artificial neurons, which can be arranged together so as to form an artifi-
cial neural network. In order to acquire knowledge, these neural networks undergo
an adaptive process, known as learning or training, which alters (some of) the pa-
rameters within the network. Therefore, the authors argued that in a simplified
form, a framework to design an ANN is composed of a set of artificial neurons, a
pattern of interconnection for these neurons, and a learning algorithm. Similarly,
the authors argued that in evolutionary algorithms, there is a set of 'artificial
chromosomes' representing a population of individuals that iteratively suffer a
process of reproduction, genetic variation, and selection. As a result of this process,
a population of evolved artificial individuals arises. A framework, in this case,
would correspond to the genetic representation of the individuals of the popula-
tion, plus the procedures for reproduction, genetic variation, and selection. There-
fore, the authors adopted the viewpoint that a framework to design a biologically
inspired algorithm requires, at least, the following basic elements:
• A representation for the components of the system.
• A set of mechanisms to evaluate the interaction of individuals with the envi-
ronment and each other. The environment is usually simulated by a set of input
stimuli, one or more fitness function(s), or other mean(s).
• Procedures of adaptation that govern the dynamics of the system, i.e. how its
behaviour varies over time.
Adopting this approach, de Castro and Timmis [1] proposed such a framework
for AIS. The basis of the proposed framework is therefore a representation to cre-
ate abstract models of immune organs, cells, and molecules; a set of functions,
termed affinity functions, to quantify the interactions of these 'artificial elements';
and a set of general-purpose algorithms to govern the dynamics of the AIS.
[Fig. 4 depicts the framework as a layered structure: the Application Domain at the
base, then Representation, Affinity Measures and Immune Algorithms, leading to
the Solution at the top.]

Fig. 4. A framework for AIS © de Castro and Timmis [1]

The framework can be thought of as a layered approach as shown in Fig. 4. In
order to build a system, one typically requires an application domain or target func-
tion. From this basis, the way in which the components of the system will be repre-
sented will be considered. For example, the representation of network traffic may
well be different from that of a real-time embedded system. Once the
representation has been chosen, one or more affinity measures are used to quantify
the interactions of the elements of the system. There are many possible affinity
measures (which are partially dependent upon the representation adopted), such as
Hamming and Euclidean distances. The final layer involves the use of algorithms,
which govern the behaviour (dynamics) of the system. Here, in the original frame-
work proposal, algorithms based on the following immune processes were pre-
sented: negative and positive selection, clonal selection, bone marrow, and immune
network algorithms. It is not possible to explore these here in any detail; suffice it to
say that each algorithm has its own particular use, or more than one use. For exam-
ple, the immune network model proposed in the framework has been successfully
applied to data mining [35] and, with slight adaptations, multi-modal optimisation
[36].
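The affinity-measure layer of the framework is the easiest to make concrete. Below is a minimal sketch of the two measures named above, assuming a binary-string and a real-valued (shape-space) representation respectively; the sign conventions are one common choice, not something mandated by the framework:

```python
import math

def hamming_affinity(ab, ag):
    # Binary-string representation: count differing (complementary)
    # positions; one common AIS convention is that higher
    # complementarity means higher affinity
    assert len(ab) == len(ag)
    return sum(1 for x, y in zip(ab, ag) if x != y)

def euclidean_affinity(ab, ag):
    # Real-valued (shape-space) representation: affinity grows as the
    # distance shrinks, so return the negated Euclidean distance
    return -math.dist(ab, ag)
```

With either measure in place, the algorithm layer above it only needs an ordering on affinities, which is why the representation and affinity choices can be swapped without touching the immune algorithms themselves.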
5.2 Machine Learning

5.2.1 Recognising DNA


The past few years have seen a steady increase in attempts to apply the
immune metaphor to machine learning [37]. Amongst the first was the work performed
by Cooke and Hunt [20,38]. In these papers, the authors describe their attempts to
create a supervised machine learning mechanism to classify DNA sequences as ei-
ther promoter or non-promoter classes, by creating a set of antibody strings that
could be used for this purpose. Work had already been done on this classification
problem using different approaches such as C4.5 [39] standard neural networks
and a nearest neighbour algorithm [40]. The authors claimed that the AIS system
achieved an error rate of only 3% on classification, which, when compared to the
other established techniques, yielded superior performance. The system created
used mechanisms such as B-cells and B-cell stimulation, immune network theory,
gene libraries, mutation and antibodies to create a set of antibody strings that
could be used for classification. Central to the work was the use of the immune
network theory [2].
Hunt and coworkers [41,42] attempted to apply this algorithm to the domain of
case base reasoning. In these papers, the authors proposed creating a case memory
organisation and case retrieval system based on the immune system. Hunt et al.
[43] took the application to case base reasoning and attempted to apply it directly
to data mining. In the previous work [41], only cases with no variations were ex-
plicitly represented, but as indicated by the authors in [43], a desirable property of
any case base system is the ability to generalise; that is, to return a case that is a
general solution if no specific solution is available. As the immune system creates
generality in the fight against infection, the authors used this as inspiration to cre-
ate the idea of a general case, which would attempt to identify trends in data, as
opposed to simply the data themselves. By introducing the idea of a generalised
case, the authors created a system that could help in the customer-profiling do-
main; specifically, identifying people who are likely to buy a personal equity plan
(PEP) which was a tax-free investment available at the time.

5.2.2 Fraud Detection


This algorithm was then applied to fraud detection [43-45]. Hunt et al. [43] sim-
ply proposed the idea that an AIS could be used to create a visual representation of
loan and mortgage application data that could in some way aid the process of lo-
cating fraudulent behaviour. An attempt at creating such a system was proposed in
[44]. This system, called JISYS, did not differ substantially from that described in
[43] apart from the application and the inclusion of more sophisticated string
matching techniques, such as trigram matching, and the inclusion of weighting, by
order of importance, of the various fields in the B-cell object, taken from the weighted
nearest neighbour idea [40].
5.2.3 Back to Basics


Timmis et al. [46] developed an AIS inspired by the immune network theory,
based on work undertaken by Hunt et al. [43]. The proposed AIS consisted of a set
of B-cells, links between those B-cells, and cloning and mutation operations that
are performed on the B-cell objects. The AIS was tested on the well-known Fisher
Iris data set. This data set contains three classes, of which two are not linearly
separable. Each B-cell in the AIS represents an individual data item that could be
matched (by Euclidean distance) to an antigen or another B-cell in the network
(according to Jerne's immune network theory). The links between the B-cells were
calculated by a measure of affinity between the two matching cells. If this affinity
is above the network affinity threshold (NAT), it could be said that there is enough
similarity between the two cells for a link to exist. The strength of this link is pro-
portional to the affinity between them. A B-cell also has a certain level of stimula-
tion that is related to the number and the strength of the links a cell has. The AIS
also had a cloning mechanism that produced randomly mutated B-cells from B-
cells that became stimulated above a certain threshold. The cloning mechanism is
inspired by somatic hypermutation, which produces mutated cells in the human body.
The network is trained by repeatedly presenting the training set to the network.
The AIS produced some encouraging results when tested on the Fisher Iris data set
[47]. The proposed system successfully produced three distinct clusters, which
allowed a known data item, when presented, to be classified. However, although
the clusters were distinct, there was still a certain amount of connection between
Iris virginica and Iris versicolor. The AIS also experienced an uncontrolled popu-
lation explosion after only a few iterations, suggesting that the suppression
mechanism (culling 5% of the B-cells) could be improved. This work was com-
pared to other traditional cluster analysis techniques and Kohonen networks [48],
and found to compare favourably [49].
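The threshold-based linking rule can be sketched directly. In this illustration affinity is taken to be Euclidean distance, with links formed when the distance falls below the NAT computed as the average pairwise distance (as in the AINE variant described below); the link-strength scaling and the stimulation rule are simplified assumptions, not the published equations:

```python
import itertools
import math

def network_links(cells, nat_scalar=1.0):
    # NAT: the average Euclidean distance over all pairs of cells,
    # optionally scaled. Two cells are linked when their distance falls
    # below the NAT, with link strength growing as the distance shrinks.
    pairs = list(itertools.combinations(range(len(cells)), 2))
    nat = nat_scalar * sum(math.dist(cells[i], cells[j])
                           for i, j in pairs) / len(pairs)
    links = {(i, j): 1.0 - math.dist(cells[i], cells[j]) / nat
             for i, j in pairs
             if math.dist(cells[i], cells[j]) < nat}
    return nat, links

def stimulation(cell_index, links):
    # Here a cell's stimulation is simply the sum of its link strengths;
    # the published calculation also includes antigen matches and
    # suppression terms
    return sum(s for pair, s in links.items() if cell_index in pair)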
This work was then taken further by Timmis and Neal [50]. In this paper the
authors raise and address a number of problems concerning the work in [46]. A
number of initial observations were clear: the network underwent exponential
population explosion; the NAT eventually became so low that only very similar, if
not identical, clones could ever be connected; the number of B-cells removed from
the system lagged behind the number created, to such an extent that the population
control mechanism was not effective in keeping the network population at a sensi-
ble level; the network grew so large that it became difficult to compute each it-
eration in a reasonable time; and the resultant networks were so large that they were
difficult to interpret, and really too big to be a sensible representation of the data.
With these concerns in mind, the authors proposed a new system called RLAIS
(resource limited artificial immune system). This was later renamed AINE (artifi-
cial immune network). To summarise the work in [50], AINE is initialised as a net-
work of ARB objects (artificial recognition balls); T-cells, again, are currently ig-
nored. Links between ARBs are created if their distance is below the NAT, which is
the average Euclidean distance between each item in the data set. The initial network
is a cross-section of the data set to be learnt; the remainder makes up the antigen
training set. Each member of this set is matched against each ARB in the network,
again with the similarity being calculated on Euclidean distance. ARBs are stimu-
lated by this matching process and by neighbouring ARBs in the network. Again,
a certain amount of suppression is included in the ARB stimulation level calculation.
The equation used as a basis for B-cell stimulation calculation was based on
Equation (1). The stimulation level of an ARB determines the survival of the B-
cell. The stimulation level also indicates whether the ARB should be cloned and the
number of clones that are produced for that ARB. Clones undergo a stochastic
process of mutation in order to create a diverse network that can represent the an-
tigen that caused the cloning as well as slight variations. There exist a number of
parameters to the algorithm, those being: network affinity scalar; mutation rate
and number of times the training data are presented to the network. Each one of
these can be used to alter algorithm performance. The population control mecha-
nism, which replaced the 5% culling mechanism, forces ARBs to compete for sur-
vival based on a finite number of resources that AINE contains; the more stimu-
lated an ARB, the more resources it can claim. Once an ARB no longer claims any
B-cells, it is removed from AINE. Previously, 5% was always removed; with
AINE this is not the case: no predetermined number is set for removal, and the
amount removed depends on the performance of the algorithm. This gives rise to a
meta-dynamical system that will extract patterns or clusters from the data being
learnt. The authors propose that AINE is a very effective learning algorithm, and
on test data so far, very encouraging results have been obtained. The authors test
the system on a simulated data set and the Iris data set. With the Iris data set, three
distinct clusters can be obtained, unlike the original AIS proposed. Additionally,
the networks produced by AINE are much smaller than the original system. In ef-
fect, AINE is acting as a compression facility, reducing the complexity of the net-
works, so as to highlight the important information, or knowledge, that can be ex-
tracted from the data. This is achieved by a special visualisation tool outlined in
[51]. More details of these algorithms can be found in [18,50]. However, more re-
cent work has shown that the networks produced by AINE suffer strong evolu-
tionary pressure and converge to the strongest class represented in the data [52].
Whilst this is an interesting development that could potentially be applied to opti-
misation, with regard to data mining it would not be preferable. From a continu-
ous learning point of view it is more desirable if all patterns persist over time
rather than only the strongest. Neal [53] has developed a form of the original algorithm
that is capable of finding stable clusters. Here, a different population control
mechanism based on exponential decay of stimulation level calculations is used,
and the system allows for the continual learning of clusters of information, even in
the absence of antigenic input.
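The resource-limited population control of AINE, and the exponential stimulation decay of Neal's variant, can be sketched as follows. The proportional-share allocation rule and the half-life parameterisation are simplified readings for illustration only, not the published equations:

```python
import math

def allocate_resources(stimulations, total_resources=100):
    # Share a finite resource pool in proportion to stimulation; an ARB
    # whose integer share falls to zero claims nothing and is purged
    # from the network
    total = sum(stimulations.values())
    shares = {arb: int(total_resources * s / total)
              for arb, s in stimulations.items()}
    return {arb: n for arb, n in shares.items() if n > 0}

def decayed(stimulation, dt=1.0, half_life=5.0):
    # Exponential decay of stimulation between antigen presentations,
    # as in Neal's stable-cluster variant (half-life value is arbitrary)
    return stimulation * math.exp(-math.log(2) * dt / half_life)
```

Because the pool is finite, no predetermined cull fraction is needed: weakly stimulated ARBs lose their share to stronger ones and disappear, which is the meta-dynamical behaviour described above.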

5.2.4 Multi-Layered Immune Inspired Learning


In parallel to this work, Knight and Timmis [54] have developed a multi-layered
immune inspired algorithm for data mining. The motivation for this work was to
take a step back from existing work and attempt a more holistic approach
to the development of an immune inspired algorithm. It was noted that a more ho-
listic approach might provide a better solution in the search for an immune in-
spired data-mining algorithm capable of continuous learning. Rather than focusing
on the immune network theory, the authors adopted aspects of the primary and
secondary responses seen in the adaptive immune system. This new approach in-
corporates interactions between free antibodies, B-cells, and memory cells, using
the clonal selection processes as the core element of the algorithm. This three-
layered approach consists of a free-antibody layer, B-cell layer and a memory
layer. The free antibody layer provides a general search and pattern recognition
function. The B-cell layer provides a more refined pattern recognition function,
with the memory layer providing a stable memory structure that is no longer influ-
enced by strong evolutionary pressure. Central to the algorithm is feedback that
occurs between B-cells and is part of the secondary immune response in the algo-
rithm. Novel data are incorporated into the B-cell layer and are given a chance to
thrive, thus providing a primary immune response. Initial testing of this algorithm
has shown good performance at static clustering.

5.2.5 Data Clustering


Similar work to that of Timmis and Neal [50] has been undertaken by de Castro
and Von Zuben [35]. In this work the authors propose a system called aiNet, the
driving force of which is data clustering and filtering redundant data. Again, for
inspiration the authors utilise the immune network theory and the idea of shape
space. The proposed aiNet is likened to a weighted disconnected graph, where
each cell represents a set of variables (attributes or characteristics) which is said to
characterise a molecular configuration, hence a point in p-dimensional space
(shape space). Cells are allowed connections between them based on some simi-
larity measure. Suppression within aiNet is achieved by eliminating self-similar
cells under a given threshold (defined by the user). Cells within aiNet compete
with each other for recognition of antigens (training data) and if successful prolif-
erate and are incorporated into the emerging network. The algorithm is as follows:
the training data are presented to an initial randomly generated network. Affinity
between antigens and network cells is calculated and the highest matched cells are
cloned and mutated. A heuristic is placed in the algorithm that increases the
weighting of well-matched cells by decreasing their distance between the antigen
items; this is akin to a greedy search. The affinity between these cells in this new
matrix is then calculated, with the lowest matched cells being removed (this is
based on a predetermined threshold set by the user). A certain number of cells are
then removed from the network; again, based on a threshold value predetermined
by the user, the new clones are then integrated into the network. The cells in the
network then have their affinities with each other recalculated, and again a certain
number that fall under the user-defined threshold are removed. After the learn-
ing phase, the network can be said to be a representation of the data set that is be-
ing learnt. Clusters and patterns will emerge within the network and can be used
for knowledge extraction. Once the networks have been created, the authors then
use a variety of statistical techniques for interpreting the networks. The authors'
main goal for aiNet is two-fold: identify the number of clusters within the data and
determine which network cell belongs to which cluster. To achieve this, the au-
thors apply the minimal spanning tree algorithm to the network. The authors test
their system on two data sets: a simple data set of five linearly separable classes and the fa-
mous Donut problem. Good results are obtained for each of the experiments, aiNet
identifies the clusters within the data and manages to represent those clusters with
a reduced number of points; thus reducing the complexity of the data. Work by de
Castro and von Zuben [55] explores the possibility of using immunological meta-
phors for Boolean competitive networks.
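A single aiNet learning iteration, as described above, might be sketched as follows. The directed mutation step, the thresholds and the treatment of suppression are simplified assumptions for illustration; the published algorithm also recalculates cell-cell affinities in the clone matrix and prunes weakly matched cells:

```python
import math
import random

def ainet_step(network, antigen, n_best=3, n_clones=4,
               suppression_threshold=0.1, rng=random.Random(0)):
    # Clone the network cells closest to the antigen, move each clone a
    # random fraction of the way toward the antigen (the greedy, directed
    # mutation described above), then suppress self-similar clones.
    # Fixed seed keeps the illustration reproducible.
    ranked = sorted(network, key=lambda c: math.dist(c, antigen))
    clones = []
    for parent in ranked[:n_best]:
        for _ in range(n_clones):
            step = rng.random()
            clones.append([p + step * (a - p)
                           for p, a in zip(parent, antigen)])
    # Network suppression: drop clones lying too close to a kept cell
    kept = list(network)
    for clone in clones:
        if all(math.dist(clone, k) > suppression_threshold for k in kept):
            kept.append(clone)
    return kept
```

Each iteration pulls new cells toward the presented antigen while the suppression threshold controls redundancy, which is how aiNet ends up representing clusters with a reduced number of points.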

5.2.6 Inductive Learning


Research by Slavov and Nikolaev [56] attempted to create an inductive computa-
tion algorithm based upon metaphors taken from immunology. In their paper, they
describe an evolutionary search algorithm based on a model of immune network
dynamics. By imitating the behaviour of constantly creating and removing good
solutions, coupled with attempts to create a diverse range of solutions, and by
incorporating these dynamic features into the fitness function of the immune
algorithm, high diversity and efficient search navigation were achieved. The
authors claim an efficient and effective solution when compared to more
traditional GAs.

5.2.7 Sparse Distributed Memory


Hart and Ross [57-59] have used an immune system metaphor to address the
problem of finding and tracking clusters in non-static databases. They note that, in
order to be ultimately useful in the real world, a successful machine-learning (ML)
algorithm should address the following characteristics observed in very large, real-
world databases:
• Databases are non-static; data is continually added and deleted
• Trends in the data change over time
• The data may be distributed across several servers
• The data may contain a lot of 'noise'
• A significant proportion of the data may contain missing fields or records
The biological immune system performs remarkably well in a dynamic envi-
ronment; the system is continuously exposed to a variety of ever-changing patho-
gens, and it must adapt quickly and efficiently in order to counteract them. More-
over, the biological immune system is robust to noisy and incomplete information.
Therefore the metaphor embodies exactly those characteristics that, it is proposed, a
good ML algorithm must contain. Hart and Ross's work combines an immune sys-
tem metaphor with that of another class of associative memories - the sparse dis-
tributed memory (SDM). This type of memory was first suggested by Kan-
erva [60], and since then Smith et al. [16] have shown that the immune system and
SDM can be considered analogous. The SDM is a robust memory that derives its
properties from the manner in which it performs sparse sampling of huge input
spaces by a small number of recognition units (equivalent to B-cells and T-cells in
the immune system), and from the fact that the memory is distributed amongst
many independent units. This is analogous to the memory population of the IS,
which again consists of B-cells and T-cells.
In brief, an SDM is composed of a set of physical or hard locations, each of
which recognises data within a specified distance of itself - this distance is known
as the recognition radius of the location, and all data recognised are said to lie
within the access circle of the location. In the case of storing binary data, this dis-
tance is simply interpreted as Hamming distance. Each location also has an asso-
ciated set of counters, one for each bit in its length, which it uses to 'vote' on
whether a bit recalled from the memory should be set to 1 or 0. An item of data is
stored in the memory by distributing it to every location which recognises it - if
recognition occurs, then the counters at the recognising locations are updated by
either incrementing the counter by 1 if the bit being stored is 1, or decrementing
the counter by 1 if the bit being stored is 0. To recall data from the memory, all lo-
cations that recognise the address from which recall is being attempted vote by
summing their counters at each bit position; a positive sum results in the recalled
bit being set to 1, a negative sum in the bit being set to 0. This results in a memory
which is particularly robust to noisy data, due to its distributed nature and in-
exact method of storing data.
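These store and recall mechanics translate almost line for line into code. A minimal binary SDM sketch follows; the location count, word length and recognition radius are arbitrary illustrative values:

```python
import random

class SparseDistributedMemory:
    def __init__(self, n_locations=200, word_len=10, radius=4, seed=0):
        rng = random.Random(seed)
        # Hard-location addresses: randomly chosen and fixed from the start
        self.addresses = [[rng.randint(0, 1) for _ in range(word_len)]
                          for _ in range(n_locations)]
        # One signed counter per bit at each location
        self.counters = [[0] * word_len for _ in range(n_locations)]
        self.radius = radius

    def _recognising(self, address):
        # A location recognises an address lying within Hamming distance
        # `radius` of it (its access circle)
        return [i for i, loc in enumerate(self.addresses)
                if sum(a != b for a, b in zip(loc, address)) <= self.radius]

    def write(self, address, data):
        # Distribute the word: +1 for a stored 1, -1 for a stored 0
        for i in self._recognising(address):
            for j, bit in enumerate(data):
                self.counters[i][j] += 1 if bit else -1

    def read(self, address):
        # Every recognising location votes; the sign of each sum decides
        # the recalled bit
        votes = [0] * len(self.counters[0])
        for i in self._recognising(address):
            for j, c in enumerate(self.counters[i]):
                votes[j] += c
        return [1 if v > 0 else 0 for v in votes]
```

Writing a word at its own address and reading it back recovers it exactly as long as at least one hard location recognises the address; recall from nearby (noisy) addresses works through the overlap of access circles.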
These properties make it an ideal candidate as a basis for building an immune
system based model for addressing clustering problems in large, dynamic data-
bases. For example, we can consider each physical location, along with its recogni-
tion radius, to define a cluster of data; the location itself can be considered to be a
concise representation or description of that cluster, and the recognition radius
specifies the size of the cluster. Clusters can overlap - indeed, it is precisely
this property that allows all data to be recognised with high precision whilst main-
taining a relatively low number of clusters. This has a direct parallel in the bio-
logical immune system in which antibodies exhibit cross-reactivity. If no overlap
were allowed in an SDM, then a large number of locations would be required to
cluster the data, the system would become overly specific, and hence general
trends in the data would be lost. The analogy between the immune system and the
SDM class of associative memories is detailed in Table 1, taken from Smith et al.
[16].

Table 1. Analogy between the immune system memory and SDM

Immunological Memory       SDM
-----------------------    ------------------
Antigen                    Address/Data
B/T Cell                   Hard Location
Ball of Stimulation        Access Circle
Affinity                   Hamming Distance
Primary Response           Write and Read
Secondary Response         Read
Cross-Reactive Response    Associative Recall

In its original form, however, the SDM is a static memory, and is built
on several assumptions that make it unsuitable for direct use as a model for data
clustering. In brief, these assumptions are that the addresses of the hard locations
are randomly chosen and fixed from the start, and that the recognition radii of
each address are equal and constant. Hart and Ross first addressed these problems
in a system named COSDM (Hart and Ross, 2001), in which they adapted a co-
evolutionary genetic algorithm architecture first proposed by Potter and De Jong
(2000), cGA, to form an immune system based model capable of clustering static
and dynamic data-sets. cGA is another data-clustering algorithm which uses an
immune-system metaphor to categorise a benchmark set of data (Congress Voting
records), and performs very well compared to more classical categorisation tech-
niques such as ID3.
In COSDM, an antigen represented an item of data and an antibody defined a
hard location and its recognition radius. The antibodies co-operate to form an
SDM type of memory in which antigen data can be stored. The system consisted
of a number of populations of potential antibodies - each population contributed
one antibody to the memory. A co-evolutionary GA was used to quickly find the
'location' of the antibodies and the size of their corresponding balls of stimulation
in order to best cluster the data currently visible to the system. If an antibody rec-
ognised an antigen, the antigen was 'stored' by that antibody. The accuracy of
clusters produced was determined by attempting to recall each antigen and then
comparing the results to the actual data in the database. Antibody populations
were added and deleted dynamically - if the best member of a population did not
make a significant contribution to the memory, then the population was deleted.
Similarly, if the system was not able to improve the clustering accuracy over a
predetermined number of generations, then a new population was added. This sys-
tem was tested on a number of benchmark static and dynarnic data-sets - although
it showed some promise on dustering dynamic data-sets, it was outperformed by
the immune system of [61] on large, static data-sets. The difficulties arose in
evolving a suitable size for the ball of recognition of each antibody, which led to
some antigen never being recognised by any of the antibodies in the system. Also,
the system required large numbers of evaluations to find a reasonable SDM, due to
the nature of the co-evolutionary architecture.
Hart and Ross [58, 59] tackled these issues with a system based on an SDM, as
in COSDM, but with an architecture akin to that used in a self-organising map;
the system is thus called SOSDM (Self-Organising SDM). A diagram of SOSDM
is shown in Fig. 5. In this system, the recognition radius is replaced by a mecha-
nism in which all antibodies in the system compete for antigen data; antigens
bind to all those antibodies for which they have an affinity greater than some
preset affinity threshold, with a strength proportional to their affinity. Thus, the
binary counters in the SDM are replaced with real-valued counters, updated
according to the strength of the binding. Each antibody accumulates a measure of
its own error, that is, how distant the antigens recognised by it are from its own
description (based on the Hamming distance between the antibody and the anti-
gen). This quantity is then used to allow the antibodies to self-organise, that is,
antibodies gravitate towards regions of the space in which they best recognise
antigen. The counters also move with each antibody, but decay over time; thus
they contain a historical record of the data that has been recognised by the anti-
body. As in COSDM, new antibodies are added periodically, and antibodies can
also be deleted dynamically. SOSDM is thus truly adaptive and self-organising,
and as such encapsulates some of the most important features of the biological
immune system.
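The binding-and-drift cycle just described can be sketched as follows (a deliberate simplification of SOSDM: the majority-vote move rule and the threshold value are our own illustrative choices, not the authors'):

```python
def affinity(a, b):
    """Bits in common between two bit strings (length minus Hamming distance)."""
    return sum(x == y for x, y in zip(a, b))

def sosdm_step(antibodies, antigens, threshold):
    """One SOSDM-style iteration: bind, then self-organise.

    Each antibody binds every antigen whose affinity reaches the
    threshold, then drifts towards the majority bit pattern of what it
    bound -- its accumulated 'error' pulls it into the region of the
    space it best recognises.
    """
    moved_antibodies = []
    for ab in antibodies:
        bound = [ag for ag in antigens if affinity(ab, ag) >= threshold]
        if not bound:
            moved_antibodies.append(ab)    # nothing recognised: stay put
            continue
        moved = ""
        for i, bit in enumerate(ab):
            ones = sum(ag[i] == "1" for ag in bound)
            if ones * 2 > len(bound):
                moved += "1"               # majority of bound antigens say 1
            elif ones * 2 < len(bound):
                moved += "0"               # majority say 0
            else:
                moved += bit               # tie: keep the current bit
        moved_antibodies.append(moved)
    return moved_antibodies
```

Iterating this step moves each antibody to the centre of the antigen cluster it serves, which is the self-organising behaviour the text describes; counter decay and antibody addition/deletion are omitted for brevity.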
An Overview of Artificial Immune Systems 75

(Figure: a database supplies input antigen data to the SOSDM; each antibody has
a ball of recognition, the sphere within which binding occurs.)

Fig. 5. Diagrammatic representation of the SOSDM model

SOSDM has been shown to outperform other published immune algorithms on
benchmark static data sets, and furthermore its performance has been shown to
scale with both the size of the data set and the length of the antigens within the
data set. It was also tested on data sets which contained known clusters of un-
equal sizes, and was shown to be satisfactory at detecting small clusters.
SOSDM was also tested on a number of time-varying data sets. The experi-
ments tested scenarios likely to represent the extremes of those that might realis-
tically occur in a real-world situation. Thus, one set examined scenarios in which
data in one cluster was gradually replaced with new data still belonging to the
same cluster, whereas the other set examined cases where whole clusters were
suddenly deleted and replaced by entirely new clusters containing different data.
SOSDM performed well at both tasks, though some loss in recall accuracy was
observed as the number of clusters being replaced was increased. SOSDM was
also shown to exhibit a basic form of memory: when re-exposed to familiar anti-
gens, it reacted more rapidly than to previously unseen antigen. The system
appeared relatively robust to the period of the memory. In summary, SOSDM
provides a scalable, fast and accurate way of clustering data, and also builds on
the analogy between the SDM and the immune system first presented by Smith et
al. [16] to produce a system that is more faithful to the principles of the biologi-
cal system than the original analogy suggested.

5.2.8 Supervised Learning with Immune Metaphors


Carter [62] made use of the immune network theory to produce a pattern recogni-
tion and classification system. This system was known as Immunos-81. The au-
thor's aim was to produce a supervised learning system implemented at a high
level of abstraction of the workings of the immune system.

The model consisted of T-cells, B-cells, antibodies and an amino-acid library.
Immunos-81 used the artificial T-cells to control the production of B-cells. The
B-cells would then in turn compete for the recognition of the 'unknowns'. The
amino-acid library acts as a library of epitopes (or variables) currently in the sys-
tem. When a new antigen is introduced into the system, its variables are entered
into this library. The T-cells then use the library to create their receptors, which
are used to identify the new antigen. During the recognition stage of the algo-
rithm, T-cell paratopes are matched against the epitopes of the antigen, and a
B-cell is then created with paratopes that match the epitopes of the antigen.
Immunos-81 was tested using two standard data sets, both from the medical
field. The first was the Cleveland data set, which consists of the results of a
medical survey on 303 patients suspected of having coronary heart disease. This
data set was used as a training set for the second data set, a series of 200 unknown
cases. Immunos-81 achieved an average classification rate of 83.2% on the
Cleveland data set and approximately 73.5% on the second data set. When com-
pared to other machine learning techniques, Immunos-81 performed very well.
The best rival was a k-nearest neighbour classifier [63], which averaged 82.4% on
the Cleveland data set, whereas other clustering algorithms [64] managed 78.9%,
and C4.5 obtained only 77.9% accuracy. The author therefore argues that
Immunos-81 is an effective classifier system: the algorithm is simple and the
results are transparent to the user. Immunos-81 also has the potential to learn in
real time and to be embedded. It has proved to be a good example of using the
immune system as a metaphor for supervised machine learning systems.
Watkins [65] proposed a resource-limited artificial immune system classifier
model, building on work by Timmis [18] and de Castro and von Zuben [66].
Here the author extracted metaphors such as resource competition, clonal selec-
tion and memory cell retention to create a classification model named AIRS.
Results presented in this work are very encouraging. Benchmark data sets such as
the Fisher Iris, Ionosphere and sonar data sets were used to test the effectiveness
of the algorithm. AIRS was found to perform at the same level of accuracy as
other well-established techniques, such as C4.5 and CART. Recent work has
highlighted several revisions that could be made to the original algorithm [67].
That work showed that the internal representation of the data items was over-
complicated, and that by simplifying the evolutionary process it was possible to
decrease the complexity whilst still maintaining accuracy. The authors also adopt
an affinity-aware somatic hypermutation mechanism, to which they attribute
improved quality of memory cells and therefore greater data reduction and faster
classification.

5.3 Robotics

Attempts have been made to apply the immune network idea to control large
populations of robots to have some form of self-organising group behaviour. Work
by Mitsumoto et al. [68] attempts to create a group of robots that behave in a
self-organising manner, searching for food without any global control mechanism.
Central to their idea is the interaction between robots at the local level. The au-
thors use three main immunological metaphors. The first is B-cells, where a robot
represents a B-cell and each robot has a particular strategy for finding food.
The second is the immune network, allowing for interaction between robots. The
third is the calculation of B-cell stimulation, where the more the robot is stimu-
lated, the better its strategy is considered to be. In order to calculate B-cell
(robot) stimulation, a modified version of Eq. (1) is used, in which the robot is
stimulated and suppressed by neighbouring robots and stimulated by the outside
environment. Each robot carries a record of its degree of success in collecting
food, while neighbouring robots compare their success and strategies and stimu-
late and suppress each other accordingly. If a robot's stimulation level is low, its
strategy is considered too weak and, losing that strategy, the robot randomly
selects another. If the robot is well stimulated, the strategy is considered to be
good and is preserved. Over time the robots interact and successfully achieve the
food collection. The authors claim good results on their test data, but indicate the
need for further research and testing.
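A minimal sketch of this stimulation/strategy-replacement loop (the coefficient, threshold, strategy names and the linear neighbour comparison are all our own illustrative assumptions standing in for the modified Eq. (1)):

```python
import random

def step(robot, neighbour_successes, env_reward, k=0.1, threshold=0.0,
         strategies=("wander", "spiral", "follow-wall"), rng=random):
    """One interaction step for a robot treated as a B-cell.

    The robot is stimulated by neighbours collecting less food than it
    and suppressed by neighbours collecting more (a crude stand-in for
    the stimulation/suppression terms of Eq. (1)), plus an environment
    term rewarding its own recent success. A weakly stimulated robot
    abandons its strategy and picks another at random.
    """
    network = sum(robot["success"] - s for s in neighbour_successes)
    robot["stim"] += k * (network + env_reward)
    if robot["stim"] < threshold:              # strategy considered too weak
        robot["strategy"] = rng.choice(strategies)
        robot["stim"] = 0.0                    # restart with the new strategy
    return robot
```

A successful robot keeps its strategy; an unsuccessful one keeps resampling until stimulation from its neighbours and the environment lets a strategy persist — with no global controller involved.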
This work is advanced by Mitsumoto et al. [69], where similar techniques were
applied to create a group of robots that interact to achieve the transportation of
multiple objects to multiple locations. The algorithm is very similar to the first:
the B-cell is represented by a robot, the work to be done by the robots is analo-
gous to antigens, and communication between robots is achieved via the network.
The idea of B-cell cloning is also introduced into the algorithm, and is used to
pass messages to other robots. Here, a robot is stimulated by interaction with
other neighbouring robots and the work environment. If a robot is achieving the
work, then it receives more stimulation. If that robot becomes well stimulated, it
produces clone B-cells that contain information about the work it is doing, since
it is considered to be good work. Other robots in the network then match these
and, if they share similar work, they become stimulated and produce other similar
work B-cells. If they do not match well, a robot will attempt to adapt its work to
the most common work strategy it encounters. This interaction and passing of
messages enables a group behaviour to emerge that can solve the transportation
problem. The authors also showed that this is successful whether the work re-
mains static or the work requirement changes over time.
In very similar work by Lee et al. [70], the immune network metaphor is ap-
plied to creating swarm strategies for mobile robots; this work is virtually identi-
cal to that presented above. The authors extend the concept in (Lee et al., 1999),
introducing the metaphor of the T-cell into the algorithm: they propose a modi-
fied version of Eq. (1) with the addition of a T-cell term. However, the authors
do not include results of using the modified equation in their simulations, pre-
senting instead results obtained using the equation without the T-cell interaction.
Work by Watanabe et al. [71] and Kondo et al. [72] attempts to create a mecha-
nism by which a single, self-sufficient autonomous robot, the immunoid, can per-
form the task of collecting various amounts of garbage from a constantly changing
environment. The environment for the immunoid consists of garbage to be col-
lected, and a home base consisting of a wastebasket and a battery charger. The au-
thors use the metaphors of antibodies, which are potential behaviours of the im-
munoid; antigens, which are the environmental inputs such as the existence of
garbage, walls and home bases; and the immune network, which is used to support
good behaviours of the immunoid. In order to make the best strategy decision, the
immunoid detects antigens and matches the content of each antigen against all the
antibodies that it possesses. For example, the immunoid may have antibodies that
are suitable for when a wall is met head-on and it therefore needs to turn right.
Each antibody of the immunoid records its concentration level, which is calcu-
lated using Eq. (1). A number of antigens (environmental inputs) are detected, the
concentration levels of the antibodies are calculated, and the antibody with the
highest concentration is selected as the appropriate behaviour to employ. In ex-
perimental results, the authors prepared 24 antibodies (potential behaviours) for
the immunoid and observed good results. The authors then extended this work in
an attempt to create more emergent behaviour within the network of robots [71]
by the introduction of genetic operators.

5.4 Fault Diagnosis and Tolerance

The field of diagnosis is a vast one, driven by the requirement to accurately pre-
dict or recover from faults occurring in plant. One approach to detecting abnor-
mal sensors within a system [73] has been to use a combination of Learning
Vector Quantization (LVQ) [74] and the immune network metaphor. The idea
behind the system is to use LVQ to determine a correlation between two sensors
from their outputs when they work properly, and then use an immune network to
test sensors using the extracted correlations. Within the system, each sensor cor-
responds to a B-cell, and sensors test one another's outputs to see whether or not
they are normal. Each sensor calculates a value based on an adapted version of
Eq. (1), where the inputs to the equation are the reliability of the sensor, rather
than similarity to the neighbour. A sensor that has a low value is considered to
be faulty and can therefore be flagged as needing repair. This method has the
advantage of requiring no overall control mechanism for checking for faulty
sensors; they can detect for themselves when they are faulty. Simulations of their
system showed the potential for good diagnostic results, and the paper points the
way forward for more research and actual application to real plants.
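The mutual-testing idea can be sketched as follows (our own simplification: a symmetric agreement test and an additive reliability update stand in for the adapted Eq. (1); the constants are illustrative):

```python
def diagnose(n, agrees, iterations=10, k=0.2):
    """Immune-network-style mutual sensor diagnosis (illustrative).

    `agrees(i, j)` returns +1 if sensors i and j corroborate each
    other's outputs and -1 otherwise. Each sensor's reliability is
    pushed up or down by its neighbours, weighted by the neighbours'
    own reliability, so a clique of mutually consistent sensors
    reinforces itself while an outlier is driven towards zero --
    with no central controller deciding which sensor is faulty.
    """
    r = [0.5] * n                                  # initial reliability
    for _ in range(iterations):
        r = [min(1.0, max(0.0,
                          ri + k * sum(r[j] * agrees(i, j)
                                       for j in range(n) if j != i) / (n - 1)))
             for i, ri in enumerate(r)]            # synchronous update
    return r
```

With three sensors, two of which agree with each other while the third disagrees with both, the outlier's reliability collapses while the consistent pair reinforce each other.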
Also in the field of diagnosis, there has been an interest in creating other dis-
tributed diagnostic systems. Initial work by Ishida et al. [5, 75] proposed a paral-
lel distributed diagnostic algorithm. The authors likened their algorithm to an
immune network, due to its distributed operation and the system's emergent co-
operative behaviour between sensors. This work was then continued by Ishida et
al. [76, 77], and an active diagnosis mechanism was proposed in [78]. The work
in [78] builds on foundations laid in the others, so it will be briefly examined
here.
Active diagnosis continually monitors the consistency of the current states of
the system with respect to the normal state. The authors argue that the immune
system metaphor is a suitable basis for creating an effective active diagnostic
system. Central to their idea is the immune network theory, where each sensor
can be equated with a B-cell [73]. Sensors are connected via a network (the
immune network), with each sensor maintaining a record of sensory reliability,
which is continually changed over time, creating a dynamic system. Sensors in
the network can test each other for reliability; where this work differs from the
above is in the way in which the reliability of each sensor is calculated, which
will not be explored here. The key features of the immune system used by this
work are distributed agents that interact with each other in parallel (each agent
acting only on its own knowledge and not via a central controller), and the crea-
tion of a memory of the sensor state, formed via a network.
Hardware fault tolerant systems seek to provide a high degree of reliability and
flexibility even in the presence of errors within the system. Such a system must
be protected from a variety of potential faults, manifesting in such forms as per-
manent stuck-at faults or intermittent faults.
Bradley and Tyrrell [79] proposed what they called Immunotronics (immu-
nological electronics) in order to implement a finite-state-machine-based counter
using immune principles. Their proposed system relied upon the negative selec-
tion algorithm, which is responsible for creating a set of tolerance conditions to
monitor changes in hardware states. They employed a binary Hamming shape-
space to represent the tolerance conditions.
Recent work by Timmis et al. [80] discusses important issues in the design of
immune-inspired fault tolerant embedded systems. The authors highlight that one
advantage of using a technique based on AIS, in comparison to traditional fault
tolerant approaches, is the possibility of exploiting the evolutionary property of
the immune system. While conventional fault tolerant techniques generate static
detectors that have to be updated offline, AIS-based techniques will enable the
development of adaptable fault tolerant systems, in which error detectors may
evolve during runtime. This feature will increase the availability of embedded
systems, since acceptable variations of non-erroneous states can be integrated into
the self-set. For example, external factors (e.g. temperature) induce changes that
might have significant effects on the system's functionality, while internal changes
(e.g. component replacement) could give rise to variability in self that must be
noticed. The authors also note that AIS techniques pose some challenges. One of
them is the need to ensure that the detectors generated fully cover the non-self
space (i.e. the erroneous states). This is determined by the mode of detector gen-
eration, which in turn affects the resulting detector set as well as the speed of the
operation; however, the distribution of the self-data can be exploited to enhance
the process. Other metaphors of the immune system are also tagged as potential
avenues for research in this area, such as the adaptability inherent in immune
network metadynamics [7].

5.5 Optimisation

In order to address the issue of designing a Genetic Algorithm (GA) with im-
proved convergence characteristics, particularly in the field of constrained design
optimisation, Hajela et al. [81] proposed a GA simulation of the immune system.
The motivation for their work stems from the fact that genetic algorithms, when
applied to design constraints, have been found to be very sensitive to the choice
of algorithm parameters, which can ultimately affect the convergence rate of the
algorithm. The authors use the idea of antibody-antigen binding to construct a
matching utility that defines similarity between design solutions. This is based on
work by Farmer et al. [27] (Sec. 3.3.1.1) and is simply a bit-by-bit match over
contiguous regions. The model created also simulates the dynamics of the im-
mune system by creating and removing possible new solutions. Some solutions
will be more specific to the problem areas, whereas others will be more general-
ised. However, the authors point out that both specialist and general solutions
are important in the context of structural design, so they introduce a control pa-
rameter into the algorithm that enables them to control the production of special-
ist and general-case solutions. The authors suggest their algorithm leads to a
higher convergence rate when compared to a traditional GA, but indicate the
need for further research and application. It should be noted, however, that while
the authors claim to use the immune network as a metaphor, in reality they use
the immune system, as there is no apparent network interaction during the
algorithm.
The above work focused on a specific search problem in a particular domain;
Toma et al. [82] adopt a more generic approach to adaptive problem solving
through the immune network metaphor. Again, the authors claim the use of a
network structure, but do not present the work as such, using instead immune
system metaphors including B-cells, T-cells, macrophages and the Major Histo-
compatibility Complex (MHC). The immune algorithm given in the paper is used
to produce adaptive behaviours of agents, which are used to solve problems. The
algorithm is applied to the n-TSP problem, and for small-scale problems achieves
good results. The authors also experiment with removing the interaction of the
T-cell from the searching algorithm and present convincing results that the effect
of the T-cell on performance is significant, as the solutions found using the T-cell
have lower cost overall. Other, similar applications of the immune network
metaphor to multi-modal function optimisation can be found in [83-85]. Here the
authors use somatic hypermutation and immune network theory to create and
sustain a diverse set of possible solutions in the search space, combined with a
traditional genetic algorithm. The authors propose that their algorithm possesses
two main characteristics: (i) the ability to create a diverse set of candidate solu-
tions, and (ii) the ability to perform an efficient parallel search. Alongside
somatic hypermutation, the authors also employ the standard genetic algorithm
operators of crossover and mutation. The authors apply their algorithm to finding
optimal solutions of various functions and compare the results obtained with a
standard GA approach. They argue that the strength of their algorithm lies in its
ability to maintain a higher diversity of candidate solutions than a standard GA,
which is important when attempting to find the global maximum on any search
surface.
De Castro and Von Zuben [66] focused on the clonal selection principle and
affinity maturation process of an adaptive immune response to develop an algo-
rithm suitable for tasks like machine learning, pattern recognition and optimisa-
tion. Their algorithm was evaluated on a simple binary character recognition
problem, multi-modal optimisation tasks and a combinatorial optimisation prob-
lem, more specifically the travelling salesman problem (TSP). The main immune
aspects taken into account in developing the algorithm were: maintenance of a
specific memory set; selection and cloning of the most stimulated cells; death of
non-stimulated cells; affinity maturation and re-selection of the clones propor-
tionally to their antigenic affinity; and generation and maintenance of diversity.
The performance of their algorithm was compared with a GA for multi-modal
optimisation, and the authors claim their algorithm was capable of detecting a
high number of sub-optimal solutions, including the global optimum of the func-
tion being optimised. This work was further extended with the use of the immune
network metaphor for multi-modal optimisation in [36].
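The ingredients just listed can be sketched for a one-dimensional function-maximisation task (a toy rendering of the clonal selection idea, not the authors' algorithm; every parameter here is an illustrative assumption):

```python
import random

def clonalg_max(f, lo, hi, pop=20, n_select=5, gens=50, beta=2.0, rng=None):
    """Toy clonal-selection optimiser (illustrative, not the original).

    Each generation: rank cells by affinity f(x); clone the best in
    proportion to rank; mutate clones with a step that grows as
    affinity rank worsens (affinity maturation); re-select from the
    combined pool; and replace the worst cells with random newcomers
    (death of non-stimulated cells / maintenance of diversity).
    """
    rng = rng or random.Random(42)
    cells = [rng.uniform(lo, hi) for _ in range(pop)]
    for _ in range(gens):
        cells.sort(key=f, reverse=True)
        clones = []
        for rank, cell in enumerate(cells[:n_select]):
            n_clones = max(1, int(beta * pop / (rank + 1)))
            step = (hi - lo) * (rank + 1) / (10.0 * n_select)
            for _ in range(n_clones):
                clones.append(min(hi, max(lo, cell + rng.gauss(0, step))))
        pool = sorted(cells + clones, key=f, reverse=True)
        # keep the best, refresh the tail with random cells
        cells = pool[:pop - 3] + [rng.uniform(lo, hi) for _ in range(3)]
    return max(cells, key=f)
```

On a single-peak affinity function the best cell converges on the optimum; because the random newcomers keep exploring, the final population also retains cells away from the peak, which is how sub-optimal solutions survive in the multi-modal case.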

5.6 Scheduling

Creating optimal schedules in a constantly changing environment is not easy.
Work by Mori et al. [86], Chun et al. [87] and Mori et al. [84] proposes and de-
velops an immune algorithm that can create an adaptive scheduling system based
on the metaphors of somatic hypermutation and the immune network theory.
Work by Mori et al. [84] builds on the above by addressing the issue of batch
sizes and combinations of sequence orders, which optimise objective functions.
In these works, antigens are considered as input data or disturbances in the opti-
misation problem, and antibodies are considered as possible schedules. Prolifera-
tion of the antibodies is controlled via an immune network metaphor in which
stimulation and suppression are modelled in the algorithm. This assists in the
control of antibody (or new solution) production. The T-cell effect in this algo-
rithm is ignored. The authors claim that their algorithm is an effective optimisa-
tion algorithm for scheduling and has been shown to be good at finding optimal
schedules. The authors indicated that further work could be undertaken in apply-
ing this algorithm to a dynamically changing environment. This work was under-
taken in Hart et al. [88] and more recently in Hart and Ross [89].
Hart et al. [88] propose a system that can create a diverse set of schedules, not
necessarily an optimal solution for the scheduling problem, that can be easily
adapted should the situation change. The authors consider antibodies to be single
schedules and antigens to be possible changes to the schedule. Their system pro-
duces a set of antibodies (schedules) that can cover the whole range of possible
changes in the antigen set. Using these metaphors, and that of gene libraries to
create new antibodies, the authors have shown that they can create a set of sched-
ules, using a GA, from an initial random state of possible changes. Their system
can then successfully retrieve schedules corresponding to antigens existing in that
set, and also to new antigens (or changes in situations) previously unseen. In later
work, Hart and Ross [90] proposed a scheduling application of an artificial im-
mune system, called PRAIS (Pattern Recognising Artificial Immune System). In
their system, sudden changes in the scheduling environment require the rapid
production of new schedules. Their model operates in two phases. The first phase
uses the immune system analogy, in conjunction with a genetic algorithm, to
detect common patterns amongst scheduling sequences frequently used by a fac-
tory. In the second phase, some of the combinatorial features of the natural im-
mune system are modelled to use the detected patterns to produce new schedules,
either from scratch or by starting from a partially completed schedule.

5.7 Computer Security

The problem of protecting computers from viruses, unauthorised users, etc. consti-
tutes a rich field of research for artificial immune systems. The existence of a
natural immune system to fight against biological microorganisms like viruses and
bacteria is probably the most appealing source of inspiration for the development
of an artificial immune system to combat computer viruses and network intruders.

5.7.1 Network Security


The role of the immune system may be considered analogous to that of computer
security systems [91]. Whilst there are many differences between living organisms
and computer systems, researchers believe that the similarities are compelling and
could point the way to improved computer security. Long-term research projects
have been established in order to build a computer immune system [6, 92, 93 and
94] which could augment a simple computer security system with more advanced
and novel features. A good overview of the current work in this field is presented
by Somayaji et al. [95], where an attempt is made to draw together various pieces
of research in the field in order to derive some basic principles of computer im-
mune systems.
There are a number of approaches to implementing a computer security system.
Host-based intrusion detection methods [96, 97] construct a database that cata-
logues the normal behaviour pattern of a piece of software, specific to a particu-
lar machine, software version, etc. Such a database enables the program's behav-
iour to be monitored.
In order to build up a pattern of normal behaviour for a particular piece of
software, the system calls made by the software are monitored and recorded over
time. As this record builds up, the software may be monitored for any system
calls not found in the normal behaviour database, which are then flagged. The
authors argue that, while simplistic, this approach is not computationally expen-
sive and can be easily used in real time. It also has the advantage of being plat-
form and software independent.
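The normal-behaviour database just described is commonly built from short sliding windows of system calls, in the spirit of Forrest and colleagues' sequence-of-system-calls work (the window size and call names below are illustrative):

```python
def build_profile(trace, k=3):
    """Record every length-k window of system calls seen during normal runs."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}

def anomalies(profile, trace, k=3):
    """Return the windows of a new trace that never occurred during training."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)
            if tuple(trace[i:i + k]) not in profile]
```

A run that only recombines familiar windows raises no flags, while an injected call produces anomalous windows even when every individual call is legitimate — which is why the method is cheap enough to run in real time.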
An alternative method is the network-based intrusion detection approach. This
tackles the issue of protecting networks of computers rather than an individual
computer. It works in a similar way, monitoring network services, traffic and
user behaviour, and attempts to detect misuse or intrusion by observing depar-
tures from normal behaviour. Work by Hofmeyr and Forrest [97, 98] and by Kim
and Bentley [99] lays the foundations for a possible architecture and general re-
quirements for a network-based intrusion detection system built on immune sys-
tem metaphors. Kim and Bentley [100] propose a network intrusion detection
algorithm based on the metaphors presented in their previous paper. The algo-
rithm is based on the negative selection algorithm first proposed by Forrest et al.
[6]. Negative selection in the immune system refers to the elimination of self-
reactive cells, ensuring that the immune system does not attack self (Sec. 3.7).
The algorithm in [6] consists of three phases: defining self, generating detectors
and monitoring the occurrence of anomalies. In that paper, it was applied to the
detection of computer viruses.
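The three phases can be sketched with the r-contiguous-bits matching rule often used with negative selection (an illustrative toy, not Forrest et al.'s implementation; string lengths and parameters are our own):

```python
import random

def r_contiguous(a, b, r):
    """True if strings a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False

def generate_detectors(self_set, n, length, r, rng=None):
    """Censoring phase: keep random candidates that match no self string.

    (Naive generate-and-test; it can be slow if self covers most of
    the space -- the 'mode of detector generation' issue noted above.)
    """
    rng = rng or random.Random(0)
    detectors = []
    while len(detectors) < n:
        cand = "".join(rng.choice("01") for _ in range(length))
        if not any(r_contiguous(cand, s, r) for s in self_set):
            detectors.append(cand)
    return detectors

def monitor(detectors, sample, r):
    """Monitoring phase: any detector match signals an anomaly (non-self)."""
    return any(r_contiguous(d, sample, r) for d in detectors)
```

Because every detector was censored against the self set, self strings are tolerated by construction; anything a detector does match is, by definition, non-self.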

Recently, Dasgupta [91, 101] proposed an agent-based system for intrusion/
anomaly detection and response in networked computers. In his approach, the
immunity-based agents roamed around the nodes and routers, monitoring the
situation of the network. The most appealing properties of this system were
mobility, adaptability and collaboration. The immune agents were able to interact
freely and dynamically with the environment and each other.

5.7.2 Virus Detection


Much interest has been shown in applying immune system metaphors to virus de-
tection. In the first work in this area, Forrest et al. [6, 94] developed a simple
algorithm using the negative selection metaphor to detect potential viruses in
computer systems. This work was concerned with distinguishing normal computer
resources and behaviour from abnormal. A different approach is taken by Kephart
and his co-workers [102-104]. Their initial approach is to use the metaphor of
the innate immune system. This resides on the user's PC and applies virus-
checking heuristics to .COM and .EXE files. If an unknown virus is detected, a
sample that contains information about the virus is captured and sent to a central
processing system for further examination. This is analogous to how the innate
immune system works as the first line of defence. In the central processing ser-
vice, the virus is encouraged, or baited, to reproduce itself in a controlled envi-
ronment. This allows examination of the virus and extraction of its signature. An
antidote can then be constructed and sent out to the infected PC, and the virus
removed.
The signature extraction mechanism is based on immune system metaphors,
such as clonal selection, producing large numbers of possible code signatures in
order to detect the virus code signature. This is achieved by generating large
numbers of random signatures and checking each of these signatures against the
potential virus; a positive match indicates that a virus has been detected.
Marmelstein et al. [105] proposed an alternative multi-layer approach, which
attempts to tackle the infection at varying levels of protection; ultimately, if the
infection cannot be identified, an evolutionary algorithm is applied to create
alternative decoy programs to trap the virus. This was extended by Lamont et al.
[106], and then by Harmer and Lamont [107], into a distributed architecture for
a computer virus detection system based on immune system principles.

6 Summary

Using the immune system as inspiration has proved very useful when trying to ad-
dress many computational problems. The immune system is a remarkable leaming
system. Through the use of B-cells and T-cells the immune system can launch an
attack against invading antigens and remove them from the system. This is
achieved through a process of B-cell stimulation followed by cloning and muta-
tion of new antibodies. This diversity that is generated allows the immune system
to be adaptive to new, slightly different infections. The immune system is able to
retain information about antigens, so that next time the body is infected a quicker,
84 J.Timmis et al.

secondary immune response can be triggered to eliminate the infection. A number
of theories exist about how the immune system retains this information in a type
of memory: the most popular being the clonal selection theory, with its idea of
memory cells, and the alternative immune network theory, with its idiotypic
interactions between antibodies.
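A minimal sketch of the loop described above — stimulation by an antigen, cloning in proportion to affinity, and mutation inversely proportional to it — might look as follows. The bit-string representation and all parameter values are illustrative assumptions, not those of any particular published algorithm:

```python
import random

def affinity(antibody, antigen):
    """Affinity = number of matching bits (higher means stronger binding)."""
    return sum(a == b for a, b in zip(antibody, antigen))

def mutate(antibody, rate, rng):
    """Somatic hypermutation: flip each bit with probability `rate`."""
    return [bit ^ (rng.random() < rate) for bit in antibody]

def clonal_selection(antigen, pop_size=20, generations=30, seed=1):
    rng = random.Random(seed)
    length = len(antigen)
    pop = [[rng.randrange(2) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ab: affinity(ab, antigen), reverse=True)
        clones = []
        for rank, ab in enumerate(pop):
            n_clones = max(1, (pop_size - rank) // 4)          # fitter cells clone more
            rate = 0.5 * (1 - affinity(ab, antigen) / length)  # fitter cells mutate less
            clones.extend(mutate(ab, rate, rng) for _ in range(n_clones))
        pop = sorted(clones, key=lambda ab: affinity(ab, antigen),
                     reverse=True)[:pop_size]
    return pop[0]  # the best cell, retained as a 'memory cell'

antigen = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
memory_cell = clonal_selection(antigen)
```

The diversity generated by mutation, and the retention of the best cell, are exactly the two properties — adaptivity and memory — discussed above.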
From observing this natural system, researchers have identified many interest-
ing processes and functions within the immune system that can provide a useful
metaphor for computation. The review of the field of artificial immune systems
(AIS) has revealed many and varied applications of immune metaphors. The pro-
posed framework for AIS was outlined, with the main idea being that it is possible
to think of AIS in terms of a layered framework that consists of representations,
affinity measures and immune algorithms. The field of machine learning
was then examined. Early work by Cooke and Hunt [38] spawned a great deal of
research, which led to a generic unsupervised learning algorithm proposed by
Timmis and Neal [50], which ultimately forms part of the proposed framework
for AIS. Other approaches to learning have also been adopted by de Castro and
von Zuben [35] to create a clustering algorithm. Work by Hart and Ross [57] pro-
posed a modified immune algorithm capable of clustering moving data that is
adaptable to clustering large volumes of data. Carter [62] proposed the use of
immunological metaphors for supervised machine learning, which had the advantage
over other supervised methods that the results were transparent. Attempting to
create collective behaviour in robots has also been a major field of study. By using
ideas from the immune network theory, work by Mitsumoto et al. [68] and subse-
quent works built a small system of self-organising and autonomous robots that
could be used for simple collection and navigational tasks. Some of the pioneering
work in AIS was done in the field of fault diagnosis, primarily following work by
Ishida [5]. This led to the development of the active diagnosis field [76] and, more
recently, the idea of immunotronics and hardware fault tolerance [79]. Attention has also
been paid to using immune metaphors for solving optimisation problems [81], by
augmenting genetic algorithms. More recent work by Hart and coworkers [59, 88-
90] tackled the difficult problem of producing an adaptive scheduling system;
this work combined the use of genetic algorithms and immunological
metaphors. A significant field of research is that of computer security and virus
protection. From the early work by Forrest et al. [6] for computer security and by
Kephart [102], a significant body of research has been generated and architectures
for various security and virus detection systems proposed [98, 106].
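The self-nonself discrimination idea of Forrest et al. [6] that underpins much of this security work can be sketched roughly as follows. The r-contiguous-bits matching rule comes from that line of work, while the self strings and parameter values here are invented for illustration:

```python
import random

def r_contiguous_match(a, b, r):
    """True if strings a and b agree in at least r contiguous positions."""
    run = best = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best >= r

def censor(candidates, self_set, r):
    """Negative selection: discard any detector that matches a 'self' string."""
    return [d for d in candidates
            if not any(r_contiguous_match(d, s, r) for s in self_set)]

def monitor(detectors, sample, r):
    """Flag an anomaly if any surviving detector matches the monitored sample."""
    return any(r_contiguous_match(d, sample, r) for d in detectors)

rng = random.Random(42)
self_set = ["0011001100110011", "0101010101010101"]   # normal behaviour
candidates = ["".join(rng.choice("01") for _ in range(16)) for _ in range(200)]
detectors = censor(candidates, self_set, r=6)
```

By construction, the surviving detectors never fire on self, so anything they do match is, by definition, non-self.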

7 Comments on the Future for AIS

For each of the many contributions discussed in Sec. 5 it would be possible to talk
at length regarding possible extensions and improvements. Instead, we will keep
our discussion about future trends on AIS at a high level.
Although several papers have been discussed proposing the use of AIS to solve
problems in many areas of research, few of those have attempted to present a for-
mal artificial immune system (e.g., [19, 43, 98]). In reviewing all these works on
An Overview of Artificial Immune Systems 85

AIS, it becomes clear that the area is lacking the proposal and development of a
general framework in which to design artificial immune systems. Observing other
comparable computational intelligence (or soft computing) paradigms, such as ar-
tificial neural networks, evolutionary computation and fuzzy systems, it is clear
that there exist well-described sets of components and/or mechanisms
with which to design such algorithms. To this end, a framework has been pro-
posed [1], although much work remains to be done on this framework in terms of
formalisation from a mathematical viewpoint and augmentation in terms of new
shape spaces and the development of new algorithms which have been inspired by
other areas of immunology, as yet unexplored by computer scientists and engi-
neers.
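As a caricature of how the layers of the proposed framework [1] separate — a representation, an affinity measure over it, and an immune algorithm built on top — one might write the following. The concrete choices (a real-valued shape space, negated Euclidean distance, a trivial selection step) are illustrative assumptions rather than part of the framework itself:

```python
# Layer 1: representation -- cells and antigens live in a real-valued shape space.
def make_cell(values):
    return tuple(values)

# Layer 2: affinity measure -- here, negated Euclidean distance over that space,
# so that a smaller distance means a higher affinity.
def affinity(cell, antigen):
    return -sum((c - a) ** 2 for c, a in zip(cell, antigen)) ** 0.5

# Layer 3: immune algorithm -- phrased purely in terms of the two layers below;
# here, simply selecting the most stimulated cell in the repertoire.
def most_stimulated(repertoire, antigen):
    return max(repertoire, key=lambda cell: affinity(cell, antigen))

repertoire = [make_cell(v) for v in ([0.0, 0.0], [1.0, 1.0], [0.4, 0.6])]
winner = most_stimulated(repertoire, antigen=(0.5, 0.5))
```

The value of the layering is that each layer can be swapped independently: a binary representation with Hamming affinity, say, leaves the algorithm layer untouched.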
This leads to an interesting avenue of research. To date, the concentration has
mainly been on the more basic immunology, such as simple antibodies, clonal se-
lection and so forth. Recent work by Aickelin and Cayzer [108] postulated the use
of the Danger Theory [109] for AIS. This is an interesting idea, sadly beyond the
scope of this chapter. However, it is quite possible the danger theory has much to
offer AIS in terms of a paradigm shift in thinking - as yet unexplored. This shift is
away from the idea that the immune system controls the response, typically
adopted in AIS, to the idea that the tissue initiates and controls the response, in
some way contextualising it. This could provide a powerful metaphor,
especially in terms of data mining in dynamic environments, where the context of
what you might want to learn may change over time. Indeed, this debate was
opened further by Bersini [7], who argued that the danger theory idea is nothing
more than a change of terminology from the idea of self/non-self to the idea that
something is dangerous or not dangerous. It will be interesting to observe whether this
debate gathers pace.
There is also a debate to be had: is there a single immune algorithm? To date, this
has not been addressed and is still an open question. One observation often made
regarding AIS is that there are so many different algorithms around, it is so com-
plicated and you never know which one to use. In answer to this, you might argue
that more rigour and analysis has to be applied to the current algorithms to
identify exactly their suitability for problems, so that what AIS will offer is
a very rich suite of effective and well-understood algorithms. Alternatively, you
could pursue a single unified algorithm: but then, would that enhance or re-
strict the field of AIS? It might enhance it in that there would then exist a single
commonly understood algorithm - so people would know what you mean when you
say an immune algorithm. Or it might restrict it in the sense that the immune system
is such a complex system, why try to limit it to one simple algorithm - why
not exploit the many complexities therein?
It would certainly seem that there are many challenges ahead. Immunology has
a great deal to teach people involved with computational problems: we have only
just scratched the surface. A greater interaction between biology and computer
science is needed if we are to fully exploit the richness of this marvellous biologi-
cal system.

References

1 De Castro, L.N. and Timmis, J. (2002). Artificial Immune Systems: A New Computa-
tional Intelligence Approach. Springer, Berlin, Heidelberg, New York. ISBN 1-
85233-594-7
2 Jerne, N. (1974). Towards a network theory of the immune system. Annals of Immu-
nology (Inst. Pasteur). 125C. pp. 373-389.
3 Perelson, A. S. (1989). Immune Network Theory, Imm. Rev., 110, pp. 5-36.
4 Bersini, H and Varela, F. (1990). Hints for adaptive problem solving gleaned from
immune networks. Parallel Problem Solving from Nature, 1st Workshop. PPSW1.
Dortmund, Germany. Springer, Berlin, Heidelberg, New York, pp. 343-354.
5 Ishida, Y. (1990). Fully Distributed Diagnosis by PDP Learning Algorithm: Towards
Immune Network PDP Model. Proc. of the IEEE International Joint Conference on
Neural Networks. San Diego, USA, pp. 777-782.
6 Forrest, S, Perelson, A, Allen, L and Cherukuri, R (1994). Self-Nonself Discrimina-
tion in a Computer. Proc. of IEEE Symposium on Research in Security and Privacy.
Oakland, USA, pp. 202-212.
7 Bersini, H. (2002). A Tribute to ... In Proceedings of the 1st International Conference on
Artificial Immune Systems (ICARIS). Timmis, J. and Bentley, P. (Eds.) pp. 107-112.
8 De Castro, L.N and Timmis, J. (2003). Artificial Immune Systems as a Novel Soft
Computing Paradigm. Soft Computing.
9 Dasgupta, D (1998b). An overview of artificial immune systems. Artificial Immune
Systems and Their Applications. pp. 3-19. Springer, Berlin, Heidelberg, New York
10 Kepler, T and Perelson, A. (1993). Somatic Hypermutation in B-cells: An Optimal
Control Treatment. Journal of Theoretical Biology. 164. pp. 37-64.
11 Berek, C. & Ziegner, M. (1993). The Maturation of the Immune Response, Immunol.
Today, 14(8), pp. 400-402.
12 Varela, F, Coutinho, A, Dupire, B and Vaz, N. (1988). Cognitive Networks: Immune
and Neural and Otherwise. Theoretical Immunology: Part Two, SFI Studies in the Sci-
ences of Complexity, 2, pp. 359-371
13 Janeway, C. (1993). Life, Death and the Immune System. Scientific American Special
Issue. How the immune system recognises invaders, pp. 27-36.
14 Roitt, I. (1997). Essential Immunology: 9th Edition. Chap. Specific Acquired Immu-
nity, pp. 22-39. Pub. Blackwell Science, Oxford
15 Burnet, F. M. (1959). The Clonal Selection Theory of Acquired Immunity, Cambridge
University Press, Cambridge.
16 Smith, D. J., S. Forrest & A. S. Perelson (1998). Immunological Memory is Associa-
tive. In Artificial Immune Systems and their Applications. Ed. D. Dasgupta. Springer,
Berlin Heidelberg New York
17 Tizzard, I. (1988a). Immunology: An Introduction. 2nd edition. Chap. The response of
B-cells to Antigen. pp. 199-223. Saunders College.
18 Timmis, J. (2000). Artificial Immune Systems: A novel data analysis technique in-
spired by the immune network theory. Ph.D. Thesis. University of Wales, Aberyst-
wyth.2000.
19 De Castro, L. N. & Von Zuben, F. J. (1999). Artificial Immune Systems: Part I - Ba-
sic Theory and Applications, Technical Report - RT DCA 01/99, p. 95

20 Hunt, J. E. & Cooke, D. E. (1996). Learning Using an Artificial Immune System,
Journal of Network and Computer Applications, 19, pp. 189-212.
21 Tizzard, I. (1988b). Immunology: An Introduction. 2nd edition. Chap. The response of
T-cells to Antigen. pp. 224-260. Pub. Saunders College.
22 Tew, J & Mandel, T. (1979). Prolonged antigen half-life in the lymphoid follicles of
antigen-specifically immunised mice. Immunology, 37, pp. 69-76.
23 Tew, J, Phipps, P & Mandel, T. (1980). The maintenance and regulation of the hu-
moral immune response. Persisting antigen and the role of follicular antigen-binding
dendritic cells. Immunological Review, 53, pp. 175-211.
24 Ada, G. L. & Nossal, G. J. V. (1987). The Clonal Selection Theory, Scientific Ameri-
can, 257(2), pp. 50-57.
25 Matzinger, P. (1994). Immunological Memories Are Made of This? Nature, 369, pp.
605-606.
26 Coutinho, A. (1989). Beyond Clonal Selection and Network, lmmunol. Rev., 110, pp.
63-87.
27 Farmer, J, Packard, N and Perelson, A. (1986). The Immune System, Adaptation and
Machine Learning. Physica D. 22, pp. 187-204.
28 Carneiro, J & Stewart, J. (1995). Self and nonself revisited: Lessons from modelling
the immune network. Third European Conference on Artificial Life, Granada, Spain.
pp. 405-420.
29 Coutinho, A. (1980). The self non-self discrimination and the nature and acquisition of
the antibody repertoire. Annals of Immunology. (Inst. Past.) 131D.
30 Bersini, H and Varela, F. (1994). The immune learning mechanisms: Reinforcement
and recruitment and their applications. Computing and Biological Metaphors. pp.
166-192. Chapman Hall.
31 Bersini, H. (1991). Immune Network and Adaptive Control, Proc. of the First Euro-
pean Conference on Artificial Life, MIT Press, pp. 217-226.
32 Perelson, A. S., Mirmirani, M. & Oster, G. F. (1978). Optimal Strategies in Immunol-
ogy II. B Memory Cell Production, J. Math. Biol., 5, pp. 213-256.
33 Perelson, A. S. & Weisbuch, G. (1997). Immunology for Physicists, Rev. of Modern
Physics, 69(4), pp. 1219-1267.
34 Zinkernagel, R. M. & Kelly, J. (1997). How Antigen Influences Immunity, The Im-
munologist, 4/5, pp. 114-120.
35 De Castro, L. N., & Von Zuben, F. J., (2000b). An Evolutionary Immune Network for
Data Clustering, Proc. of the IEEE SBRN, pp. 84-89.
36 De Castro, L.N and Timmis, J (2002a). An Artificial Immune Network for Multimo-
dal Optimisation. In Proceedings of the Congress on Evolutionary Computation. Part
of the 2002 IEEE World Congress on Computational Intelligence, pp. 699-704,
Honolulu, Hawaii, USA. IEEE.
37 De Castro, L. N. & Von Zuben, F. J. (2000a). Artificial Immune Systems: Part II - A
Survey of Applications, Technical Report - RT DCA 02/00, p. 65.
38 Cooke, D and Hunt, J. (1995). Recognising Promoter Sequences Using an Artificial
Immune System. Proc. of Intelligent Systems in Molecular Biology. AAAI Press, pp.
89-97.
39 Quinlan, J. (1993) C4.5: Programs for machine learning. Morgan Kaufmann.
40 Kolodner, J. (1993). Case Based Reasoning. Pub. Morgan Kaufmann.
41 Hunt, J, Cooke, D and Holstein, H. (1995). Case Memory and Retrieval Based on the
Immune System. Case-Based Reasoning Research and Development, Lecture Notes in
Artificial Intelligence. 1010. pp. 205-216
42 Hunt, J & Fellows, A (1996). Introducing an Immune Response into a CBR system
for Data Mining. BCS ESG'96 Conference, published as Research and Develop-
ment in Expert Systems XIII. pp. 35-42. Springer, Berlin, Heidelberg, New York.

43 Hunt, J, King, C and Cooke, D. (1996). Immunising Against Fraud. Proc. Knowledge
Discovery and Data Mining. IEE Colloquium. IEE, pp. 3845.
44 Hunt, J, Timmis, J, Cooke, D, Neal, M and King, C. (1998). JISYS: Development of
an Artificial Immune System for real world applications. In Artificial Immune Systems
and Their Applications. Ed. D. Dasgupta. pp. 157-186.
45 Neal, M, Hunt, J and Timmis, J. (1998). Augmenting an artificial immune network.
Proc. of the IEEE SMC, San Diego, Calif., pp. 3821-3826.
46 Timmis, J, Neal, M and Hunt, J. (2000). An Artificial Immune System for Data
Analysis. Biosystems. 55(1/3), pp. 143-150
47 Fisher, R (1936). The use of multiple measurements in taxonomic problems. Annals of
Eugenics. 7, II. pp. 179-188
48 Kohonen, T. (1997a). Self-Organising Maps. 2nd Edition.
49 Timmis, J, Neal, M and Hunt, J. (1999). Data Analysis with Artificial Immune Sys-
tems and Cluster Analysis and Kohonen Networks: Some Comparisons. Proceedings
of the IEEE SMC, Tokyo, Japan. pp. 922-927.
50 Timmis, J and Neal, M. (2001). A Resource Limited Artificial Immune System for
Data Analysis. Knowledge Based Systems, 14(3-4): 121-130, June 2001.
51 Timmis, J (2001). aiVIS: Artificial Immune Network Visualisation. EuroGraphics
UK 2001 Conference Proceedings, pp. 61-69, University College London, April 2001.
52 Knight, T and Timmis, J. (2001). In N Cercone, T Lin, and Xindong Wu, editors, IEEE
International Conference on Data Mining, pp. 297-304, San Jose, Calif. December
2001. IEEE, New York
53 Neal, M. (2002). An Artificial Immune System for Continuous Analysis of Time-
Varying Data. In 1st International Conference on Artificial Immune Systems (ICARIS),
pp. 75-86, Canterbury, UK.
54 Knight, T and Timmis, J. (2002). A Multi-Layered Immune Inspired Approach to
Data Mining. Recent Advances in Soft Computing, Nottingham, UK. 2002
55 De Castro, L. N., & Von Zuben, F. J., (2001). The Construction of a Boolean Com-
petitive Neural Network Using Ideas From Immunology, submitted.
56 Slavov, V & Nikolaev, N (1998). Immune network dynamics for inductive problem
solving. Lecture Notes in Computer Science, 1498, pp. 712-721. Springer, Berlin,
Heidelberg, New York.
57 Hart, E & Ross, P. (2001). Clustering Moving Data with a Modified Immune Algo-
rithm. EvoWorkshops 2001 - Real World Applications of Evolutionary Computing.
58 Hart, E & Ross, P. (2002a). Exploiting the Analogy between Immunology and Sparse
Distributed Memories. Proc. of ICARIS-2002, pp. 49-58.
59 Hart, E. (2002b) Immunology as a Metaphor for Computational Information Process-
ing: Fact or Fiction? PhD thesis. University of Edinburgh.
60 Kanerva, P. (1998) Sparse Distributed Memory. MIT Press, Cambridge, Mass.
61 Potter, M.A. & De Jong, K.A (2000) Cooperative coevolution: An architecture for
evolving coadapted subcomponents. Evolutionary Computation, 8(1): 1-29.
62 Carter, J.H. (2000). The Immune System as a Model for Pattern Recognition and
Classification. Journal of the American Medical Informatics Association. 7(1). pp. 28-
41.
63 Wettschereck, D. Aha, D.W, and Mohri, T. (1997). A review and empirical evaluation
of feature weighting methods for a class of lazy learning algorithms. Artificial Intelli-
gence Review. 11: 273-314.
64 Gennari, J.H. Langley, P and Fisher, D. (1989). Models of incremental concept forma-
tion. Artificial Intelligence; 40: 11-61.
65 Watkins, A. (2001). A resource limited artificial immune classifier. MS Thesis. Mis-
sissippi State University. Miss.

66 De Castro, L. N. & Von Zuben, F. J. (2000c). The Clonal Selection Algorithm with
Engineering Applications, Proc. of GECCO '00 - Workshop Proceedings, pp. 36-37.
67 Watkins, A and Timmis, J. (2002). Artificial Immune Recognition Systems (AIRS):
Revisions and Refinements. In Proceedings of the 1st International Conference on Ar-
tificial Immune Systems. pp. 173-181, University of Kent at Canterbury, September.
68 Mitsumoto, N, Fukuda, T and Idogaki, T. (1996). Self-Organising Multiple Robotic
System. Proceedings of IEEE International Conference on Robotics and Automation.
pp. 1614-1619. Minneapolis, USA. IEEE.
69 Mitsumoto, N, Fukuda, T, Arai, F & Ishihara, H (1997). Control of distributed
autonomous robotic system based on the biologically inspired immunological archi-
tecture. Proceedings of IEEE International Conference on Robotics and Automation.
pp. 3551-3556. Albuquerque, N.M. IEEE, New York
70 Lee, Dong-Wook and Sim, Kwee-Bo. (1997). Artificial immune network based co-
operative control in collective autonomous mobile robots. Proc. of IEEE International
Workshop on Robot and Human Communication. Sendai, Japan. IEEE, New York,
pp.58-63.
71 Watanabe, Y, Ishiguro, A and Uchikawa, Y. (1998). Decentralised behaviour arbitration
mechanism for autonomous mobile robots using immune network. In Artificial Im-
mune Systems and their applications. Ed. D. Dasgupta. pp. 187-209. Springer, Berlin
Heidelberg New York
72 Kondo, T, Ishiguro, A, Watanabe, Y and Uchikawa, Y. (1998). Evolutionary con-
struction of an immune network based behaviour arbitration mechanism for autono-
mous mobile robots. Electrical Engineering in Japan. 123/3. pp. 1-10
73 Kayama, M, Sugita, Y, Morooka, Y & Fukuoka, S. (1995). Distributed diagnosis
system combining the immune network and learning vector quantization, pp. 1531-
1536 of Proc. IEEE 21st International Conference on Industrial Electronics and Con-
trol and Instrumentation, Orlando, USA.
74 Kohonen, T. (1997b). Self-Organising Maps. 2nd Edition. Chap. Learning Vector
Quantization. pp. 203-217. Springer, Berlin Heidelberg New York
75 Ishida, Y & Mizessyn, F. (1992). Learning algorithms on immune network model:
application to sensor diagnosis. Proc. International Joint Conference on Neural Net-
works, Beijing, China, pp. 33-38.
76 Ishida, Y (1996). Distributed and autonomous sensing based on immune network.
Proc. of Artificial Life and Robotics. Beppu. AAAI Press, pp. 214-217.
77 Ishida, Y & Tokimasa, T. (1996). Diagnosis by a dynamic network inspired by im-
mune network. Proc. World Congress of Neural Networks, San Diego, Calif. pp. 508-
511.
78 Ishida, Y. (1997). Active Diagnosis by Self-Organisation: An approach by the im-
mune network metaphor. Proceedings of the International Joint Conference on Artifi-
cial Intelligence. pp. 1084-1089. Nagoya, Japan.
79 Bradley, D. W. & Tyrrell, A. M. (2000a), Immunotronics: Hardware Fault Tolerance In-
spired by the Immune System, Lecture Notes in Computer Science, 1801, pp. 11-20.
80 Timmis, J, de Lemos, R, Ayara, M and Duncan R. (2002) Towards Immune Inspired
Fault Tolerance in Embedded Systems. To appear in the Proceedings of International
Conference on Neural Information Processing. Singapore. November 2002.
81 Hajela, P., Yoo, J. & Lee, J. (1997). GA Based Simulation of Immune Networks -
Applications in Structural Optimization, Journal of Engineering Optimization.

82 Toma, N, Endo, S & Yamada, K (1999). Immune algorithm with immune network
and MHC for adaptive problem solving. Proc. IEEE SMC. Tokyo, Japan, IV, pp. 271-
276.
83 Mori, K, Tsukiyama, M and Fukuda, T. (1996). Multi-optimisation by immune algo-
rithm with diversity and learning. Proc. of the IEEE SMC, pp. 118-123.
84 Mori, K, Tsukiyama, M and Fukuda, T (1998). Application of an immune algorithm
to multi-optimisation problems. Electrical Engineering in Japan. 122/2. pp. 30-37
85 Fukuda, T, Mori, K and Tsukiyama, M. (1998). Parallel Search for Multi-Modal
Function Optimisation with Diversity and Learning of Immune Algorithm. Artificial
Immune Systems and Their Applications. pp. 210-220. Springer, Berlin Heidelberg
New York
86 Mori, K, Tsukiyama, M and Fukuda, T. (1994). Immune Algorithm and Its Applica-
tion to Factory Load Dispatching Planning. pp. 1343-1346 of Proc. Japan-USA Sym-
posium on Flexible Automation.
87 Chun, J, Kim, M & Jun, H. (1997). Shape Optimisation of Electromagnetic Devices
Using Immune Algorithms. lEEE Transactions on Magnetics, 33,(2).
88 Hart, E. Ross, P. and Nelson, T (1998). Producing robust schedules via an artificial
immune system. Proc. of IEEE CEC'98, pp. 464-469. IEEE.
89 Hart, E. & Ross, P. (1999a). The Evolution and Analysis of a Potential Antibody Li-
brary for Use in Job-Shop Scheduling, In New Ideas in Optimisation, D. Corne, M.
Dorigo & F. Glover (Eds.), McGraw Hill, London, pp. 185-202.
90 Hart, E. & Ross, P. (1999b). An Immune System Approach to Scheduling in Chang-
ing Environments, Proc. of GECCO'99, pp. 1559-1566.
91 Dasgupta, D (1999). Immunity based intrusion detection systems: A general frame-
work. Proceedings of the 22nd National Information Systems Security Conference
(NISSC). pp. 147-159
92 D'haeseleer, P, Forrest, S and Helman, P (1996). An Immunological Approach To
Change Detection: Algorithms, Analysis and Implications. Proceedings of the 1996
IEEE Symposium on Computer Security and Privacy. pp. 110-119
93 Forrest, S, Hofmeyr, S, Somayaji, A & Longstaff, T. (1996). A sense of self for UNIX
processes. Proc. lEEE Symposium on Research in Security and Privacy. Oakland,
USA, pp. 120-128.
94 Forrest, S, Hofmeyr, S and Somayaji, A (1997). Computer Immunology. Communica-
tions of the ACM. 40/10. pp. 88-96
95 Somayaji, A., Hofmeyr, S. A. & Forrest, S. (1997), Principles of a Computer Immune
System, Proc. of the New Security Paradigms Workshop, pp. 75-81.
96 Hofmeyr, S, Forrest, S & Somayaji, A. (1998). Intrusion detection using a sequence of
system calls. Journal of Computer Security, 6, pp. 151-180.
97 Hofmeyr, S and Forrest, S (1999). Immunity by Design: An artificial immune system.
Proc. of GECCO'99, Pub. Morgan-Kaufman. pp. 1289-1296
98 Hofmeyr, S.A. and Forrest, S. (2000). Architecture for an Artificial Immune System.
Evolutionary Computation 7(1):45-68.
99 Kim, J and Bentley, P. (1998). The human immune system and network intrusion de-
tection. Proc. of 7th European Congress on Intelligent Techniques - Soft Computing.
Aachen, Germany
100 Kim, J. & Bentley, P. (1999), Negative Selection and Niching by an Artificial Im-
mune System for Network Intrusion Detection, Proc. of GECCO'99, pp. 149-158.

101 Dasgupta, D. (2000). An Immune Agent Architecture for Intrusion Detection, Proc. of
GECCO'00, Workshop on Artificial Immune Systems and Their Applications.
102 Kephart, J. O. (1994). A Biologically Inspired Immune System for Computers, R. A.
Brooks & P. Maes (Eds.), Artificial Life IV Proceedings of the Fourth International
Workshop on the Synthesis and Simulation of Living Systems, MIT Press, Cambridge,
Mass., pp. 130-139.
103 Kephart, J. O., Sorkin, G. B. & Swimmer, M. (1997), An Immune System for Cyber-
space, Proc. of the IEEE SMC'97, pp. 879-884.
104 Kephart, J. Sorkin, B. Swimmer, M and White, S. (1998). Blueprint for a computer
immune system. In Artificial Immune Systems and their Applications. Ed. D. Das-
gupta. pp. 242-260. Springer, Berlin Heidelberg New York
105 Marmelstein, R. E., Van Veldhuizen, D. A. & Lamont, G. B. (1998). A Distributed
Architecture for an Adaptive Computer Virus System. Proc. of the IEEE SMC, San
Diego, Calif., pp. 3838-3843.
106 Lamont, G. B., Marmelstein, R. E. & Van Veldhuizen D. A. (1999), A Distributed
Architecture for a Self-Adaptive Computer Virus Immune System, New Ideas in Op-
timisation, D. Corne, M. Dorigo & F. Glover (Eds.), McGraw Hill, London, pp. 167-
183.
107 Harmer, P.K. and Lamont, G.B. (2000). An Agent Based Architecture for a Computer
Virus Immune System. In Proceedings of Artificial Immune Systems Workshops. pp.
45-46. GECCO 2000, Las Vegas, USA.
108 Aickelin, U and Cayzer, S. (2002). The Danger Theory and its Application to Artificial
Immune Systems. Proceedings of the 1st International Conference on Artificial Im-
mune Systems (ICARIS). pp. 141-148. Canterbury, UK.
109 Matzinger, P. (1994a). Tolerance, Danger and the Extended Family. Annual Review of
Immunology. 12: 991-1045.
110 Warrender, C, Forrest, S & Pearlmutter, B. (1999). Detecting intrusions using system
calls: Alternative data models. Proc. of Symposium on Security and Privacy. IEEE,
New York, pp. 133-145.
Embryonics and Immunotronics:
Biologically Inspired Computer Science Systems
A. Tyrrell

Bio-Inspired Architectures Laboratory, Dept. of Electronics,


University of York, York YOlO 5DD, UK
amt@ohm.york.ac.uk

Abstract. The first part of this article details and expands the work on embryon-
ics, a recently proposed fault-tolerant cellular architecture with reconfiguration
properties inspired by the ontogenetic development of multicellular systems. The
design of a selector-based embryonic cell and its applications are presented. The
second part of this article describes a novel approach to hardware fault tolerance
that takes inspiration from the human immune system as a method of fault detec-
tion. The human immune system is a remarkable system of interacting cells and
organs that protects the body from invasion and maintains reliable operation even
in the presence of invading bacteria or viruses. Here we seek to address the field
of electronic hardware fault tolerance from an immunological perspective with the
aim of showing how novel methods based upon the operation of the immune sys-
tem can both complement and create new approaches to the development of reli-
able hardware systems. The final part of the article suggests a combined architec-
ture that would have the characteristics and advantages of both Embryonics and
immunotronics.

1 Introduction

The traditional methodology of designing electronic systems is to consider an ini-


tial specification, then to sub-divide this into smaller and smaller elements until
they are of a complexity that a human designer can manage: often called top-down
design. In this process the sub-divided items (functional elements which form an
ensemble to meet the complete system function) are rather specific in their func-
tional capability (e.g. an AND gate function, a square root function, an FIR low
pass filter function). While it might be argued that this type of design suits human
designers, it does limit the ability of a system to cope with unpredictable faults.
Biological systems are, in general, more complex than human designed systems,
and usually they are more reliable (how often have you felt your brain telling you
that 'this program has made an illegal operation'?). How does biology cope with
these problems, and can we learn from it?
A human being consists of approximately 60 trillion (60×10¹²) cells. At each in-
stant, in each of these 60 trillion cells, the genome, a ribbon of 2 billion characters,
is decoded to produce the proteins needed for the survival of the organism. This

genome contains the ensemble of the genetic inheritance of the individual and, at
the same time, the instructions for both the construction and the operation of the
organism. The parallel execution of 60 trillion genomes in as many cells occurs
ceaselessly from the conception to the death of the individual. This process is re-
markable for its complexity and its precision. Moreover, it relies on completely
discrete information: the structure of DNA (the chemical substrate of the genome)
is a sequence of four bases, usually designated with the letters A (adenine), C (cy-
tosine), G (guanine), and T (thymine).
The analogy between multicellular organisms and multiprocessor computers is
therefore not too far-fetched, and well worth investigating, particularly when con-
sidering that nature has achieved levels of complexity that far surpass any man-
made computing system. The aspect of biological organisms on which this chapter
is centred is their phenomenal robustness: in the trillions of cells that make up a
human being, faults are rare, and in the majority of cases, successfully detected
and repaired. This level of reliability is remarkable, and relies on very complex
mechanisms that are difficult to translate directly into silicon (e.g. biology typi-
cally deals with 3D structures, and it has almost infinite resources at hand).
Nevertheless, it will be seen that, by drawing inspiration from the overall structure
of biological organisms, architectures can be developed that are inherently fault
tolerant.
The embryonics project (for embryonic electronics) is inspired by the basic
processes of molecular biology and by the embryonic development of living be-
ings [1]. By adopting certain features of cellular organisation, and by transposing
them to the two-dimensional world of integrated circuits in silicon, it will be
shown that properties unique to the living world, such as self-replication and self-
repair, can also be applied to artificial objects (integrated circuits). Self-repair al-
lows partial reconstruction in case of a minor fault, while self-replication allows
complete reconstruction of the original device in cases where a major fault occurs.
These two properties are particularly desirable for complex artificial systems in
situations that require improved reliability, such as [2]:
• Applications which require very high levels of reliability, such as avionics or
medical electronics
• Applications designed for hostile environments, such as space, where the in-
creased radiation levels reduce the reliability of components
• Applications which exploit the latest technological advances, and notably the
drastic device shrinking, low power supply levels, and increasing operating
speeds, which accompany the technological evolution to deeper sub-micron
levels and significantly reduce the noise margins and increase the soft-error
rates
To increase still further the potential reliability of these systems, inspiration has
also been taken from biological immune systems - immunotronics. The acquired
immune system in humans (and most vertebrates) has a mechanism for error de-
tection, which is simple, effective and adaptable. The second part of this chapter
will introduce immunotronic ideas and suggest ways they might be incorporated
into an embryonic architecture.

2 An Overview of Embryonics

In any living being, every one of its constituent cells interprets the DNA strand al-
located in its nucleus to produce the proteins needed for the survival of the organ-
ism, independently of the particular function it performs. Which part or parts of
the DNA are interpreted will depend on the physical location of the cell with re-
spect to its neighbours.
The aim of Embryonics is to transport these basic properties to the two-dimen-
sional world of cellular arrays using specifically designed FPGAs as building
blocks. In any embryonic system, every one of its FPGA-based celIs interprets a
configuration register allocated in its memory, independently of the particular
logic function it performs. Which configuration register is interpreted will depend
on the coordinates of the cell determined by those of its neighbours. Embryonic
cellular arrays share the following properties with their biological counterparts [3-
5]:

2.1 Multicellular Organization

The artificial organism is formed by an array of programmable cells (Fig. 1). The
function of each cell is defined by a configuration register called the gene of the
cell. The same organism can contain multiple cells of the same kind, i.e., cells
with identical configuration registers.

2.2 Cellular Division

At start-up all the cells are identical, i.e. either they have their memory initialised
to a pre-defined value or the content of the memory is of no relevance (Fig. 1). A
mother cell (the zygote), arbitrarily defined as having the coordinates 0,0, propa-
gates the genome to the neighbouring (daughter) cells to the north and east. The
process continues until all the cells in the array have a copy of the genome. Each
cell will be different from the others because every one will execute one gene
according to its coordinates (Fig. 2).

2.3 Cellular Differentiation

Each gene is part of the global program of the cell, the genome, and the execution
of a particular gene depends only on the position of the cell within the array. In
other words, each cell has a copy of the genome (the set of configuration registers
allocated in its memory), and extracts and executes the gene which configures it.
In this sense, each cell is universal, i.e. it can perform the function of any other
cell, given the proper set of coordinates.
96 A. Tyrrell


Fig. 1. Initial embryonic array consisting of 'uncommitted' functional units




Fig. 2. Embryonic array with cell division and differentiation

3 The Organism's Features: Multicellular Organization, Cellular Differentiation, and Cellular Division

The environment in which embryonics' quasi-biological development occurs is
imposed by the structure of electronic circuits, and consists of a finite (but arbi-
trarily large) two-dimensional surface of silicon (Fig. 1). This surface is divided
into rows and columns, whose intersections define the cells. In keeping with bio-
logical inspiration, the cells of the artificial organism will be implemented by very
small processors with an identical physical structure (i.e., an identical set of logic
operators and connections). The cells will execute an identical program (the artifi-
cial genome) and only the state of a cell (i.e., the contents of its registers) can dif-
ferentiate it from its neighbours.
Our artificial cell (Fig. 3) is a very simple processing element, implemented us-
ing standard electronics (in this particular case on a field programmable gate
array, FPGA) and capable of executing a simple finite state machine. With this
hardware implementation of embryonic ideas, the artificial genome actually
specifies a particular logic configuration for each and every cell.
98 A. TyrreH

Hence, rather than specifying a program, it specifies that a cell should have the
functionality of a particular logic function and also the relevant connectivity of a
cell. Another cell, having different coordinates, will interpret a different part of
the genome and hence will have a different functionality and different connec-
tivity.

Fig. 3. The basic architecture of an embryonic system

Each part of this basic cell will be described in more detail later in this chapter.
This architecture presents the following advantages:

• It is highly regular, which simplifies its implementation on silicon.
• The function of the logic block can be changed without affecting the function
of other blocks.
• The simplicity of a block's architecture allows its implementation using built-in
self-test (BIST) logic to provide self-diagnosis without excessively increasing
the silicon area [6].
Digital data are transmitted from one cell to its neighbours through a North-
East-West-South (NEWS) connection. The I/O router block allows the spread of
information over the complete array, controlled by one section of the correspond-
ing configuration register.

4 Architecture of the Cell

The following sections describe in detail each of the constituent blocks of the
embryonic cell.

4.1 Memory

Each cell must have enough memory to contain a copy of the configuration regis-
ter of all the cells in the array (or at least the cells within the same column as it).
Therefore, every cell will be able to replace any other in case of failure. For our
cell a 256x20 memory was chosen. With this memory size one can build arrays of
up to 16x16 cells. While the size of applications might be limited by memory re-
sources, the partitioning of the cellular system and the limitation on the universality
of cells allow implementation to be achieved without any detriment to the sys-
tem's ability to reconfigure on fault.
On each cell, the outputs of the memory will only be used internally to config-
ure the programmable logic; therefore, a serial-input parallel-output memory was
chosen to avoid the necessity of a 20-bit bus interconnecting cells. Figure 4 shows
the memory system of one cell.

Fig. 4. Block diagram of the memory system located in each cell

Setting up an embryonic array implies two phases: one in which the coordi-
nates are calculated and the genome is downloaded, and a second in which the ar-
ray performs the desired function. These are called the configuration and opera-
tional phases, respectively.
During configuration, memory addresses are taken from the 8-bit counter,
which is incremented every time a 20-bit string is loaded. When all configuration
registers have been shifted in, the mode signal changes and the calculated coordi-
nates select the appropriate configuration register for each cell.

4.2 Address Generator

This block calculates the coordinates of the cell from the coordinates of either its
south or east neighbour. Figure 5 shows the basic architecture of this block. Re-
member it is the coordinates of a particular cell that differentiate it from other
cells in the system and define which part of the genome it should interpret to de-
fine its functionality.

Fig. 5. a Address generator. b Address generation

To calculate its coordinates, each cell receives the coordinates generated by its
east and south neighbours and increments either the row or column value of one of
them depending on the state of a selection signal. The resulting values are then
presented to the north and west neighbours to enable their positions in the array to
be calculated (Fig. 5a). Figure 5b illustrates the address generation process on a
4x4 array.
Once coordinates have been calculated and stored in a register, they can be
used to select the corresponding configuration register from the memory. In this
system, each coordinate is 4 bits wide. To construct an address, both X (row) and
Y (column) coordinates are appended, with X as the most significant nibble and Y
as the least significant one.
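The coordinate and address arithmetic just described can be sketched in a few lines (assuming the 4-bit coordinate registers mentioned above; function names are invented):

```python
def next_coords(x, y, from_south):
    """Increment the row coordinate if the address came from the south
    neighbour, or the column coordinate if it came from the east.
    Coordinates wrap within the 4-bit register width."""
    return ((x + 1) & 0xF, y) if from_south else (x, (y + 1) & 0xF)

def memory_address(x, y):
    """Append the 4-bit X (row) and Y (column) coordinates, with X as
    the most significant nibble, to form an 8-bit memory address."""
    return (x << 4) | y

print(hex(memory_address(3, 5)))  # -> 0x35
```

So the cell at row 3, column 5 selects configuration register 0x35 from its 256-entry memory.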

4.3 Logic Block

In this particular implementation of an embryonic array, the logic block performs
a 2-1 multiplexer function (hence it is able to implement any two-input logic func-
tion). Its inputs can be selected from eight possible sources. The output can be
registered and fed back so that the implementation of sequential logic is possible.
Figure 6 shows the architecture of this block.
In the circuit depicted in Fig. 6 many input-output combinations can be
achieved by programming the configuration bits (labels in bold). This selection
capability, in conjunction with the I/O router, allows the implementation and in-
terconnection of binary decision diagrams of any size, as long as the number of
cells in the array is sufficient.
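The claim that a 2-1 multiplexer can implement any two-input logic function is easy to check in software; the sketch below verifies three representative functions by routing the inputs and the constants 0 and 1 to the multiplexer:

```python
def mux(sel, a0, a1):
    """2-1 multiplexer: output a1 when sel is 1, else a0."""
    return a1 if sel else a0

# AND(a, b) = mux(a, 0, b); OR(a, b) = mux(a, b, 1);
# XOR(a, b) = mux(a, b, not b). Checked exhaustively:
for a in (0, 1):
    for b in (0, 1):
        assert mux(a, 0, b) == (a & b)      # AND
        assert mux(a, b, 1) == (a | b)      # OR
        assert mux(a, b, 1 - b) == (a ^ b)  # XOR
print("2-1 mux implements AND, OR and XOR")
```

The same routing freedom, applied per cell via the configuration register, is what lets the array realise arbitrary binary decision diagrams.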

Fig. 6. Architecture of logic block

L3:0 and R3:0 select one out of eight possible inputs on their respective multi-
plexers. It is possible to select 0 or 1 as the signal to be propagated in order to fa-
cilitate the implementation. The REG bit will determine if the output is combina-
tional or sequential. The selection input for the main multiplexer element (marked
with a star in Fig. 6) can also be selected from two of the signals controlled by
the I/O router. The value of EBUS determines whether EOBUS or EIBUS will se-
lect the block's output. If the value of PRESET is 1, then the registered output is
set to 1. If it is 0, then the registered output will become the output of the main
multiplexer element. WV controls the flow of information in the horizon-
tal/vertical direction. If WV is 1, then the south input SIN is propagated through
both EOUT and WOUT outputs. If it is 0, then EOUT and WOUT propagate the
inputs in WIN and EIN respectively.

4.4 Input/Output Router

In a conventional cellular array the output generated by a particular cell can only
be propagated to the nearest neighbours. In an embryonic array the I/O router pro-
vides additional paths allowing information to propagate not only to the nearest
neighbours, but also to more distant cells. Figure 7 shows the mechanisms for in-
formation to be routed in this block.


Fig. 7. a I/O router as an independent block. b Internal architecture

Close inspection of Fig. 7b illustrates the various paths any input could follow.
This is achieved by using tristate buffers to connect four inputs to a single output
line. Tristate buffers are controlled using selection lines generated by 2-4 decod-
ers. In Fig. 7b, labels in bold represent the selection bits stored in the configura-
tion registers. NOUT is the output coming from the corresponding functional
block.

4.5 Error Detection and Error Handling

The smartest reconfiguration strategy is no good unless there is a fast and accurate
error detection system that will signal the start of the reconfiguration process.
When a fault is self-detected by any of the cells the process of reconfiguration
begins. References on diagnostic techniques and BIST logic can be found in [7]
and [8].
Error detection in this particular case is achieved with the use of duplicated
hardware and a simple compare-and-validate procedure. That is, the major func-
tional blocks are duplicated and fed the same inputs. The outputs from any func-
tional block (e.g. memory, address generator, and logic block) and its duplicate
are compared; if they are the same, operation continues as normal; if a difference
is identified, an error is signalled and a reconfiguration 'cycle' is initiated.
Any cell detecting a self-failure issues a non-OK signal to all its neighbours,
and they propagate this signal along the row and/or column of the affected cell
(whether it is propagated to row or column neighbours will depend on the recon-
figuration strategy: if we are to reconfigure by column then the whole row must be
made transparent; by row, then the column must become transparent). When a cell
receives a non-OK signal from one of its neighbours, it becomes transparent for
the address-calculation process, i.e. the cell transmits the addresses received from
its neighbours without modification. Unaffected cells recalculate their coordinates
and consequently select a new configuration register. Figure 8 presents an embry-
onic array when all its cells are fault-free (a) and when one of the cells has failed
(b). Notice that when reconfiguration occurs, coordinates are changed following a
diagonal direction.
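The effect of transparency on address calculation can be modelled in a few lines. This is a simplified sketch of the reconfigure-by-column strategy (the function name is invented):

```python
def assign_coordinates(size, failed_columns):
    """Recompute column coordinates when whole columns are made
    transparent: a transparent column passes addresses through
    unchanged, so the surviving columns are renumbered consecutively
    and each one selects a new configuration register."""
    coords, next_col = {}, 0
    for col in range(size):
        if col in failed_columns:
            coords[col] = None          # transparent: no coordinate
        else:
            coords[col] = next_col
            next_col += 1
    return coords

print(assign_coordinates(4, {1}))  # -> {0: 0, 1: None, 2: 1, 3: 2}
```

Columns 2 and 3 silently take over coordinates 1 and 2, which is why a spare column lets the array absorb the failure.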


Fig. 8. Embryonic array a before and b after cell 1,1 failed

5 Examples

In order to explore the embryonic thesis, a very simple application is developed.
The example uses the basic embryonic cell to construct a 4x4 embryonic array
which is configured to perform a voter function similar to those used in triple
modular redundant (TMR) systems.
The logic function that represents the voter is:
f(A,B,C)= AB + AC + BC
This function is also called the majority function since it delivers the value held
by the majority of its inputs. Figure 9 shows the binary decision diagram (BDD)
for the voter function and its implementation using multiplexers.
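Working from the BDD, the three-multiplexer implementation can be checked exhaustively in software. The exact wiring below is inferred from the structure of the diagram (test A first, then B, with C feeding the remaining branches), so treat it as illustrative:

```python
def mux(sel, a0, a1):
    """2-1 multiplexer: output a1 when sel is 1, else a0."""
    return a1 if sel else a0

def voter(a, b, c):
    """Three-input majority voter built from three 2-1 multiplexers,
    following the binary decision diagram of the function."""
    return mux(a, mux(b, 0, c), mux(b, c, 1))

# Verify against f(A,B,C) = AB + AC + BC for all input combinations.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert voter(a, b, c) == ((a & b) | (a & c) | (b & c))
print("voter matches AB + AC + BC")
```

Each `mux` call corresponds to one embryonic cell configured as a multiplexer, which is why three highlighted cells suffice for the non-redundant voter.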
Three multiplexers are enough for a non-redundant implementation of the
voter. To implement a redundant version using embryonic arrays it is necessary to
observe the following: there must be one spare row and/or one spare column for
each failing cell to be tolerated.
Following this, an embryonic array is constructed which is able to implement a
three-input voter function tolerating one failed cell with a 4x4 structure.

Fig. 9. a Binary decision diagram for a three-input voter. b Implementation using multiplexers

During normal operation, all cells are programmed to propagate signals follow-
ing a diagonal path. This is achieved by sending west inputs through north out-
puts, and south inputs through east outputs in every cell of the array, as shown in
Fig. 10.
Transparent cells select a configuration register which allows them to propa-
gate signals in the horizontal and vertical directions instead of the diagonal. When
a failure occurs the array loses one row and/or one column.

Fig. 10. Configuration of cells to propagate signals diagonally

Figure 11 shows the final implementation of the voter. Bold arrows indicate the
flow of information. Note that propagation of the output signal also follows the
diagonal direction. The highlighted cells were programmed to implement the cir-
cuit. One advantage of this system over conventional implementations is that the
output could be routed through several cells so that the value of f is presented at
more than one output pin simultaneously, and because cells can perform self-
diagnostics, a correct output could always be selected.
A major improvement in the workings of an embryonic architecture would be
to increase the ability of the cells to self-diagnose errors before reconfiguration.
One possible way to do this might be to look towards biology again and consider
how immune systems work and whether such inspiration might again be mapped
into the world of electronics. The next section in this chapter explores this in some
detail.

6 Immunotronics

Ensuring the reliability of computing and electronic systems has always been a
challenge. As the complexity of systems increases, the inclusion of reliability
measures becomes progressively more complex and often a necessity for VLSI
circuits where a single error could potentially render an entire system useless.


Fig. 11. Implementation of voter function using an embryonic array

Biologically inspired systems have recently begun to be investigated for both
evolutionary and developmental approaches to reliable system design, in the form
of evolvable hardware [9] and embryonics, see above [10]. This chapter demon-
strates a completely new approach that takes inspiration from the vertebrate im-
mune system to create the beginnings of a complete hardware immune system.

7 Reliability Engineering

Reducing the failure probability and increasing reliability have been a goal of
electronic systems designers ever since the first components were developed. No
matter how much care is taken designing and building an electronic system,
sooner or later an individual component will fail. For systems operating in remote
environments such as space applications, the effect of a single failure could result
in a multi-million pound installation being rendered useless. With safety-critical
systems such as aircraft, the effects are even more severe. Reliability techniques
need to be implemented in these applications and many more. The development of
fault-tolerant techniques was driven by the need for ultra-high availability, re-
duced maintenance costs, and long-life applications, to ensure systems can con-
tinue to function in spite of faults occurring. The implementation of a fault-
tolerant mechanism requires four stages [11]:
• Detection of the error
• Confinement of the error, to prevent propagation through the system
• Error recovery, to remove the error from the system
• Fault treatment and continued system service, to repair and return the system to
normal operation
The latter three stages were to a greater or lesser extent considered in the first
part of this chapter; we will now deal with the detection of errors.
Any digital system can be analysed by modelling it as a finite state machine
(FSM) with its associated state table description. In principle any sequential digi-
tal system can be modelled as an FSM, or a set of interconnecting FSMs. An exam-
ple is shown in Fig. 12, which shows normal states and transitions and samples of
those that could potentially lead to a system failure. The FSM is therefore an ideal
representation for developing a hardware immune system.
Here we concentrate on an FSM representation as these devices are used
throughout all stages of a sequential system design, are a source of comparison
with reliability engineering research, and are also used in hardware design pack-
ages to permit direct instantiation of their design as a complete system netlist.
What follows could, however, be used with other representations and design
methods of digital design.
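Modelling a system as an FSM for this purpose amounts to enumerating its valid (input, current state, next state) transitions; any observed transition outside that set indicates a fault. A minimal sketch, using an invented two-bit counter as the system under test:

```python
# Hypothetical two-bit counter modelled as an FSM: the set of valid
# (input, current_state, next_state) transitions defines 'self'.
valid_transitions = {
    ("1", "00", "01"), ("1", "01", "10"),
    ("1", "10", "11"), ("1", "11", "00"),
    ("0", "00", "00"), ("0", "01", "01"),
    ("0", "10", "10"), ("0", "11", "11"),
}

def is_self(inp, current, nxt):
    """A transition is 'self' (normal) iff it appears in the state table."""
    return (inp, current, nxt) in valid_transitions

assert is_self("1", "01", "10")       # a valid count step
assert not is_self("1", "01", "11")   # an undefined transition: a fault
print("fault detected:", not is_self("1", "01", "11"))
```

Checking transitions rather than states is what later allows the immune system to catch an incorrect transition between two individually valid states.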
Faults are represented and analysed through the use of fault models at both the
gate and functional level within an electronic system [12]. Gate-level fault models
describe the effect of an error in terms of individual logic gates and their connec-
tions. Functional fault models check the entire function of a system at once, under
the premise that, if the functionality is correct, then the system under test is fault
free. The work presented here concentrates on the development of a novel func-
tional approach to error detection. By modelling the faults of a sequential circuit
through an analysis of the state table (that describes the functionality of the cir-
cuit) it is possible to generate tests before the circuit is even implemented. This
approach can also be used when the internal architecture and logic design change.
This feature could be very useful with biologically inspired hardware systems.

Fig. 12. Finite state machine representation of system

8 The Reliable Human Body

In contrast to the reliability techniques that have been developed for fault-tolerant
hardware, biology has managed to solve the problem in a remarkably different
way. The fabulously complex defence system in vertebrates has evolved over
hundreds of millions of years to give rise to what we now call the immune system
- a remarkable collection of organs, ducts, and cells comparable in complexity to
the body's nervous system [13]. The immune system is distributed, layered, and
ingenious in its methods of protecting the body from invasion by billions of dif-
ferent bacteria and viruses [14]. If one layer is penetrated, another comes into
play, presenting the invader with progressively more complex and clever barriers.
We concentrate on the acquired component of the immune system here, specifi-
cally the humoral immune response that protects the body from bacterial infec-
tion. Cells of the body and invaders, or antigens, are distinguished as two different
entities: one should be there, and one should not. The immune system achieves
this through the concept of self/nonself discrimination. Cells of the body define
self, anything else nonself. If it gets this wrong, either way, we are in trouble!

9 Bio-Inspired Fault Tolerance

The similarities in requirements imposed on reliable hardware systems and those
already achieved by the vertebrate immune system were highlighted by Avizienis
[15]. These include: distributed detection, autonomous operation, diversity, mem-
ory, and imperfect detection. They are all achieved by the vertebrate immune sys-
tem and are ideal for a hardware immune system. Many features are already ap-
plied to reliable system design. Embryonics has demonstrated one approach to
distributed fault tolerance by creating cellular electronic systems, for example [10].
If the layers of protection in the human body and existing methods of hardware
protection are compared as in Table 1, we find there is a gap that existing hard-
ware protection systems could potentially benefit from filling. One solution to
completing Table 1 is demonstrated with the development of immunological elec-
tronics, or immunotronics - the creation of an artificial hardware immune system.

10 Artificial Immune Systems

Artificial immune systems take their inspiration from the operation of the human
immune system to create novel solutions to problem solving. Although still a rela-
tively new area of research, the range and number of applications is already di-
verse [16, 17]. Computer security, virus protection, anomaly detection, process
monitoring, pattern recognition, robot control, and software fault tolerance are
some of the applications artificial immune systems are being applied to. One im-
portant feature links all of these applications - they operate in a software domain.
Our approach demonstrates that artificial immune systems can also exist in the
hardware domain [18].

Table 1. Layers of protection in the human body and hardware

Defence mechanism          Human immune system                   Hardware protection
Atomic barrier (physical)  Skin, mucous membranes                Hardware enclosure (physical/EM protection)
Physiological              Temperature, acidity                  Environmental settings (temperature control)
Innate immunity            Phagocytes                            N-modular redundancy, embryonics
Acquired immunity          Humoral immunity, cellular immunity   ?

Two distinct algorithms have emerged as successful implementations of artifi-
cial immune systems: the immune network model hypothesised by Jerne [19] and
the negative selection algorithm developed by Forrest et al. [20]. The negative se-
lection algorithm is used to differentiate between normal system operation, i.e.
self, and abnormal operation, i.e. nonself. This is achieved by generating a set of
detectors R, with each detector r ∈ R of length l, that fail to match any self string
s ∈ S, also of length l, in at least c contiguous positions [20].
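A minimal software sketch of negative selection with c-contiguous matching follows. Detector generation here uses simple random censoring of candidates; the optimised greedy generation used in the actual system is discussed later in the chapter:

```python
import random

def match(a, b, c):
    """True if binary strings a and b agree in at least c contiguous
    positions (the c-contiguous matching rule)."""
    run = best = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best >= c

def generate_detectors(self_set, l, c, n, seed=0):
    """Negative selection: keep random candidate strings of length l
    that fail to c-contiguously match every self string."""
    rng = random.Random(seed)
    detectors = []
    while len(detectors) < n:
        cand = "".join(rng.choice("01") for _ in range(l))
        if not any(match(cand, s, c) for s in self_set):
            detectors.append(cand)
    return detectors

self_set = {"110010", "101101"}   # illustrative self strings
R = generate_detectors(self_set, l=6, c=4, n=3)
# By construction, no detector matches any self string:
assert all(not match(r, s, 4) for r in R for s in self_set)
```

At run time a string that matches any detector is flagged as nonself, i.e. as a fault.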

11 Domain Mapping

In transferring the concepts from immunology to a hardware domain we adopt the
following analogies:
Self → normal hardware operation
Nonself → faulty operation
Memory T cells → set of stored tolerance conditions (detectors)
Antibodies → state/tolerance condition comparator and response initiator
Learning during gestation → generation of the tolerance conditions
Inactivation of antigen → return to normal operation
Lifetime of organism → operational lifetime of the hardware
Using the FSM description of the hardware shown in Fig. 12, under normal
conditions (self) only transitions tqx can occur. The advent of a fault could cause an
undefined transition tex. Concentrating on the transitions rather than the individual
states is very important, as it then enables incorrect transitions between two indi-
vidually correct states to be detected.

12 Choice of Aigorithm

The negative selection algorithm is adopted for the hardware immune system for
two reasons:
• Complex detector set generation is offset by a simple operational phase - ideal
for a hardware environment, where reduced complexity simplifies the design,
reduces component count, and promotes distribution throughout the hardware
architecture.
• Probabilistic detection permits a trade-off between storage requirements and the
probability of failing to detect a nonself hardware condition. To cater for
changes in system functionality, the use of a reconfigurable platform such as a
field programmable gate array (FPGA) enables the operation of a system to be
updated or completely changed. The elimination of rigid boundaries between
function and protection is ideal, a requirement provided by probabilistic de-
tection.

13 Architecture of the Hardware Immunisation Suite

The hardware immune system is divided into two components:
• Software/hardware testbench for data gathering and tolerance condition gen-
eration
• The run-time hardware immune system to provide real-time error detection to
the FSM

The software/hardware testbench permits data to be extracted from an already
constructed sequential hardware system where the state table description is not
fully known. The system to 'immunise' is inserted into a test wrapper that enables
the software to initiate a cycle of normal operation and monitor and record the
states of the hardware. The operation of this is discussed further in [21]. Self
strings are formed as in Fig. 13.

System inputs / Current state / Next state / (Outputs)
0010 / 01101 / 01110 / (101)

Fig. 13. Organisation of the strings to be protected. The system outputs may be optionally
added
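Forming a self string is then simply the concatenation of the monitored fields; a sketch using the example values of Fig. 13:

```python
def self_string(inputs, current_state, next_state, outputs=None):
    """Concatenate the monitored fields into one binary self string;
    the outputs field is optional, as in Fig. 13."""
    fields = [inputs, current_state, next_state]
    if outputs is not None:
        fields.append(outputs)
    return "".join(fields)

s = self_string("0010", "01101", "01110", "101")
print(s, len(s))  # -> 00100110101110101 17
```

One such string is recorded for every transition observed during the monitored cycle of normal operation.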

Tolerance condition generation is carried out in software during design and test,
by application of the negative selection algorithm using the greedy detector
generation (GDG) algorithm developed by D'haeseleer [22]. D'haeseleer showed
how optimal coverage of nonself strings, or faulty operation in our case, could be
achieved with a minimal number of detectors by extracting first those that match
the most nonself strings, and then those that match the most as-yet-uncovered
nonself strings. This is critical for an application such as this, where hardware stor-
age space could potentially be limited. Probabilistic detection also enables high
compaction of the set of nonself strings.
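The greedy coverage idea can be sketched in isolation. This is a simplified stand-in for D'haeseleer's GDG algorithm, which additionally exploits the structure of c-contiguous matching; here the match relation is given directly as a coverage map:

```python
def greedy_select(candidates, covers):
    """Simplified greedy coverage: 'covers' maps each candidate
    detector to the set of nonself strings it matches. Repeatedly
    pick the candidate covering the most still-uncovered strings."""
    uncovered = set().union(*covers.values())
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda d: len(covers[d] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            break                  # remaining strings are uncoverable
        chosen.append(best)
        uncovered -= gain
    return chosen

# Hypothetical coverage map: d1 alone covers three nonself strings.
covers = {"d1": {"a", "b", "c"}, "d2": {"c", "d"}, "d3": {"d"}}
print(greedy_select(list(covers), covers))  # -> ['d1', 'd2']
```

Two detectors cover all four nonself strings, illustrating why greedy extraction compacts the detector set.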
Generated tolerance conditions are analysed to assess the probability of failing
to detect an invalid string. This is done on a total failure probability, detectable
over a number of cycles when an error may have propagated, and also a single cycle
error detection (SCED) failure probability. Strings are single cycle detectable if
both the input and current state bits are contained within a self string, and the next
state bits are contained within a nonself string. By analysing the next state bits, the
SCED failure probability can also be determined. This is important for finite state
machine architectures where it is desirable to detect the presence of an error be-
fore the effects propagate.
In the operational phase, the hardware immune system acts as a wrapper, moni-
toring the system inputs and states (and if required the system outputs) to enable
errors to be detected before the system propagates to its next state on the follow-
ing clock edge. The hardware immune system consists of two components:
• Antigen presenting cell (B-cell). This extracts the data from the FSM and pre-
sents it to the T-cells, to determine if a response should be initiated.
• T-cell storage. The tolerance conditions (detectors) are stored in a hardware
content addressable memory (HCAM) that allows parallel searching of all
memory locations [23]. Parallel searching of all memory locations meets the
requirement of single cycle detection of nonself strings. (In a reversal of roles,
models of the immune system have previously been used to create novel forms
of content addressable memory [24, 25].)
Figure 14 shows the hardware immune system configured to monitor system in-
puts and state. The HCAM has been developed as a generic VHDL model allow-
ing re-synthesis using standard development tools to create varying sizes of mem-
ory depending on the desired storage space.
A demonstration system was synthesised for a Xilinx Virtex FPGA, 64 bits
wide and 128 words deep, to create the CAM organisation in Fig. 15. The archi-
tecture of the Virtex FPGA is ideal for constructing 4-bit CAMs using the look-
up tables (LUTs) [26]. The LUTs are then connected together to create greater
width data. Parallel matching of all tolerance conditions during operation ensures
single cycle error detection whatever speed the hardware system being protected
is running at. With no speed optimisations turned on within the Xilinx Foundation
synthesis tools, the XCV300 device that contained the hardware immune system
was estimated to operate at 45 MHz. Considering operational speed for a custom
fabricated device, parallel HCAM searching ensures the system would operate at
the full required speed of any system it was implemented to protect.

Fig. 14. Structure of the hardware immune system, incorporating the finite state machine to
be protected


Fig. 15. Architecture of the partial matching hardware content addressable memory

Partial matching capabilities were further added to the HCAM so that c con-
tiguous bit matching can be implemented rather than only complete string match-
ing. A bit string mask is added that allows each bit of the tolerance conditions to
be selectively included in a matching operation, or selectively set to a don't-care
condition. With the addition of the masking bits, the demonstration system built
allows a 32-bit wide, 128-words deep partial matching HCAM, as shown in Fig. 15.
Matches are selectively made by selecting c contiguous bits to require a match at
any one time.
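The masked partial match can be modelled in software as follows. This is a behavioural sketch only, not a model of the LUT-based hardware; the function names are invented:

```python
def masked_match(word, detector, mask):
    """Compare only the bit positions where mask is '1'; masked-out
    bits are don't-cares, as in the partial matching HCAM."""
    return all(w == d for w, d, m in zip(word, detector, mask) if m == "1")

def contiguous_mask(length, start, c):
    """Build a mask selecting c contiguous bits starting at 'start'."""
    return "0" * start + "1" * c + "0" * (length - start - c)

detector = "11010110"
mask = contiguous_mask(8, 2, 4)      # compare bits 2..5 only
print(mask)                          # -> 00111100
print(masked_match("00010111", detector, mask))  # -> True
```

Sliding the window of c ones along the mask reproduces the c-contiguous matching rule of the negative selection algorithm.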
The synthesised hardware immune system was applied to a state machine based
decade counter, the architecture and results of which are shown in [21, 27].

14 Embryonics and Immunotronic Architecture

What is now required is to incorporate the two bio-inspired architectures into one
highly fault-tolerant system. The use of FPGAs for the implementation of both
ideas makes this easier; however, the limited resources (at least compared to biol-
ogy!) do present some implementation issues. One possible solution is shown in
Fig. 16. This uses two separate networks to transmit different types of data around
the system, in a similar way to the different communication paths present in the
body.

15 Conclusion

The early history of the theory of self-replicating machines is basically the history
of John von Neumann's thinking on the matter [28, 29]. Von Neumann's automa-
ton is a homogeneous two-dimensional array of elements, each element being an
FSM with 29 states. In his historic work, von Neumann showed that a possible
configuration (a set of elements in a given state) of his automaton can implement
a universal constructor able to build onto the array any computing machine de-
Embryonics and Immunotronics 113

scribed in a dedicated part of the universal constructor, the tape. Self-replication is


then a special case of construction, occurring when the universal constructor itself
is described on the tape. Moreover, von Neumann demonstrated that his automa-
ton is endowed with two major properties: construction universality, the capability
of describing on the tape and building onto the array a machine of any dimension,
and computation universality, the capability of describing and building a universal
Turing machine.

[Figure 16 legend: immuno-embryonics — embryonic cells and antibody cells connected by separate embryonic and lymphatic communication networks, linked by a trans-layer]

Fig. 16. Proposed embryonic-immunotronic architecture

It should be remembered that, in biology, the cell is the smallest part of the living being containing the complete blueprint of the being, the genome. On the basis of this definition, von Neumann's automaton can be considered a unicellular organism, since it contains a single copy of the genome, i.e. the description stored on the tape. Each element of the automaton is thus a part of the cell, i.e. a molecule. Von Neumann's automaton, therefore, is a molecular automaton, and self-replication is a very complex process due to the interactions of hundreds of thousands of molecules.
Independently of the direct biological inspiration of the embryonics project, the
multicellular structure of complex organisms is of enormous interest for the con-
ception of any large-scale electronic circuit. In fact, biological organisms are
probably the most obvious examples of extremely complex machines capable of a
stunning degree of fault-tolerance. Handling complexity and tolerating faults are
two of the main challenges in the design of the next generations of electronic cir-
cuits.
Embryonic arrays exploit hardware redundancy to achieve fault tolerance. Spare elements are incorporated at different levels of the embryonics hierarchy, emulating the resilience of organisms to faults in their constituent molecules and cells.
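The principle of repair by spares can be illustrated with a deliberately simplified sketch (this is not the actual embryonics reconfiguration logic, which operates in hardware at molecule and cell level): the logical functions of a row are remapped onto its surviving physical cells, with the spares absorbing the displaced functions.

```python
def place(logical_count: int, num_physical: int, faulty: set) -> list:
    """Assign each logical cell to a healthy physical cell, skipping faulty
    ones; raise when the spares are exhausted, at which point repair must
    move up to the next level of the hierarchy."""
    healthy = [p for p in range(num_physical) if p not in faulty]
    if len(healthy) < logical_count:
        raise RuntimeError("spares exhausted: repair at the next level up")
    return healthy[:logical_count]

# Six logical cells on eight physical cells leave two spares; two faults
# are absorbed by shifting functions along the row.
assert place(6, 8, faulty={2, 5}) == [0, 1, 3, 4, 6, 7]
```

The hierarchy enters the sketch through the exception: once a row's spares are used up, the fault is handled at the cell or organism level instead.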
114 A. Tyrrell

This work has also demonstrated that taking inspiration from the human immune system, in the form of the negative selection algorithm, is suitable for the design of novel error detection mechanisms for integration into reliable hardware systems. Error detection is performed in real time. Error detection is probabilistic, permitting a trade-off between storage requirements and the ability to detect an error within the sequential system. In contrast to existing error detection techniques, which concentrate on single-bit errors and can sometimes fail to detect multiple errors, the hardware immune system is adept at this task. Generation of tolerance conditions is still implemented in software; the unique part of this work is the demonstration of a hardware wrapper that provides fully embedded error detection using principles from immunology. The immune system is also a separate component, permitting integration into a variety of different systems, either built with a hardware immune system in mind or added at a later point. Results have demonstrated the error detecting abilities of the hardware immune system on a range of distinctly different benchmark finite state machines.
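The detection principle reduces to a few lines: the valid transitions of the protected state machine define "self", and any observed transition outside that set is nonself, i.e. an error. A software sketch (the system described above does this in FPGA hardware; the 2-bit counter below is invented for the example):

```python
# "Self" = the valid (state, input, next_state) transitions of a hypothetical
# 2-bit counter: input 1 counts up modulo 4, input 0 holds the current state.
SELF = {(s, 1, (s + 1) % 4) for s in range(4)} | {(s, 0, s) for s in range(4)}

def is_error(state: int, inp: int, next_state: int) -> bool:
    """Flag any observed transition that is not a self transition (nonself)."""
    return (state, inp, next_state) not in SELF

assert not is_error(2, 1, 3)  # a correct count step
assert is_error(2, 1, 0)      # a corrupted next state is detected
```

In the hardware system the self set is not stored directly; tolerance conditions generated by negative selection stand in for its complement, which is what makes detection probabilistic and creates the storage/coverage trade-off described above.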
Work in the field of immunotronics is now progressing in the Bio-Inspired Ar-
chitectures research group at the University of York into the design of microproc-
essor immune systems and distributed hardware immune systems. Immunotronics
is also currently being integrated with other biologically inspired architectures
such as embryonics and evolvable hardware as part of a European Commission's
Future and Emerging Technologies project. Information is available at
http://www.poetictissue.org.

Acknowledgements. The author would like to thank a number of fellow researchers who have played significant roles in the development and implementation of the ideas described in this chapter; they include Cesar Ortega, Daryl Bradley, Alex Jackson, Mic Lones, Andy Greenstead, and Richard Canham. Thanks also go to Professor Daniel Mange, without whose continued support and innovative ideas much of this work would still be waiting to be born.

References

1 L. Wolpert. The Triumph of the Embryo. Oxford University Press, New York, 1991.
2 M. Nicolaidis. "Future Trends in Online Testing: a New VLSI Design Paradigm?". IEEE Design and Test of Computers, 15(4), 1998, p. 15.
3 D. Mange, M. Tomassini, eds. Bio-inspired Computing Machines: Towards Novel Computational Architectures. Presses Polytechniques et Universitaires Romandes, Lausanne, Switzerland, 1998.
4 D. Mange, M. Sipper, A. Stauffer, G. Tempesti. "Towards Robust Integrated Circuits: The Embryonics Approach". Proceedings of the IEEE, vol. 88, no. 4, April 2000, pp. 516-541.
5 C. Ortega and A.M. Tyrrell. "Design of a Basic Cell to Construct Embryonic Arrays". IEE Proceedings - Computers and Digital Techniques, 145(3), pp. 242-248, May 1998.
6 C. Ortega, D. Mange, S.L. Smith and A.M. Tyrrell. "Embryonics: A Bio-Inspired Cellular Architecture with Fault-Tolerant Properties". Journal of Genetic Programming and Evolvable Machines, Vol. 1, No. 3, pp. 187-215, July 2000.
7 R. Negrini, M.G. Sami, R. Stefanelli. Fault Tolerance Through Reconfiguration in VLSI and WSI Arrays. The MIT Press, Cambridge, Mass., 1989.
8 Shibayama, H. Igura, M. Mizuno, M. Yamashina. "An Autonomous Reconfigurable Cell Array for Fault-Tolerant LSIs". In: Proc. 44th IEEE International Solid-State Circuits Conference, San Francisco, California, February 1997, pp. 230-231 and 462.
9 A.M. Tyrrell, G.S. Hollingworth, S.L. Smith. "Evolutionary Strategies and Intrinsic Fault Tolerance". In: Proceedings of the 3rd NASA/DoD Workshop on Evolvable Hardware, pp. 98-106, July 2001.
10 D. Mange, M. Sipper, A. Stauffer, G. Tempesti. "Toward Robust Integrated Circuits: The Embryonics Approach". Proceedings of the IEEE, Vol. 88:4, pp. 516-541, April 2000.
11 P.A. Lee, T. Anderson. Fault Tolerance Principles and Practice. Springer Berlin Heidelberg New York, 2nd edn., 1990.
12 D.K. Pradhan. Fault Tolerant Computing: Theory and Techniques - Volume 1. Prentice-Hall, Englewood Cliffs, N.J., 1986.
13 N.K. Jerne. "The Immune System". Scientific American, Vol. 229:1, pp. 52-60, 1973.
14 C.A. Janeway, P. Travers. Immunobiology: the Immune System in Health and Disease. Churchill Livingstone, 3rd edn., 1997.
15 A. Avizienis. "Towards Systematic Design of Fault-Tolerant Systems". IEEE Computer, Vol. 30:4, pp. 51-58, April 1997.
16 D. Dasgupta, N. Attoh-Okine. "Immunity-Based Systems: A Survey". IEEE International Conference on Systems, Man and Cybernetics, 1997.
17 D. Dasgupta, N. Majumdar, F. Nino. "Artificial Immune Systems: A Bibliography". CS Technical Report CS-01-002, ver. 2.0, The University of Memphis, Tenn., June 2001.
18 D.W. Bradley, A.M. Tyrrell. "The Architecture for a Hardware Immune System". In: Proceedings of the 3rd NASA/DoD Workshop on Evolvable Hardware, pp. 193-200, July 2001.
19 N.K. Jerne. "Towards a network theory of the immune system". Ann. Immunol. (Inst. Pasteur), Vol. 125C, pp. 373-379, 1974.
20 S. Forrest, A.S. Perelson, L. Allen, R. Cherukuri. "Self-Nonself Discrimination in a Computer". Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy, pp. 202-212, 1994.
21 D.W. Bradley, A.M. Tyrrell. "Immunotronics: Novel Finite State Machine Architectures with Built in Self Test using Self-Nonself Differentiation". IEEE Transactions on Evolutionary Computation, Vol. 6, No. 3, June 2002.
22 P. D'haeseleer. "Further Efficient Algorithms for Generating Antibody Strings". Technical Report CS95-3, Department of Computer Science, University of New Mexico, 1995.
23 T. Kohonen. Content-Addressable Memories. Springer Berlin Heidelberg New York, 2nd edn., 1987.
24 C.J. Gibert, T.W. Routen. "Associative Memory in an Immune-Based System". In: Proceedings of the 12th International Conference on Artificial Intelligence AAAI-94, pp. 852-857, 1994.
25 J.E. Hunt, D.E. Cooke. "Learning using an artificial immune system". Journal of Network and Computer Applications, Vol. 19, pp. 189-212, 1996.
26 Xilinx Inc. "Virtex data sheet", 1999, http://www.xilinx.com/partinfo/virtex.pdf
27 D.W. Bradley, A.M. Tyrrell. "Multi-layered Defence Mechanisms: Architecture, Implementation and Demonstration of a Hardware Immune System". In: Proceedings of the 4th International Conference on Evolvable Systems: From Biology to Hardware (ICES2001), Lecture Notes in Computer Science 2210, pp. 140-150, October 2001.
28 J. von Neumann. The Theory of Self-Reproducing Automata. A.W. Burks, ed. University of Illinois Press, Urbana, Ill., 1966.
29 M. Sipper. "Fifty Years of Research on Self-Replication: an Overview". Artificial Life, 4(3), 1998, pp. 237-257.
Biomedical Applications of Micro and Nano
Technologies

C.J. McNeil, K.J. Snowdon

Institute for Nanoscale Science and Technology, University of Newcastle upon


Tyne, Newcastle upon Tyne, NE1 7RU, UK
calum.mcneil@ncl.ac.uk

Abstract. The functional integration of man-made devices and biological systems


represents one of the grand challenges of science and technology. Efficient real-
time exchange of information and/or materials across the molecular-scale interface
between biological and physical systems is a core platform requirement to realise
that vision. This common technology requirement underpins development of (i)
affordable diagnostic devices that harness the full potential of genomic informa-
tion through real-time predictive, preventive, point-of-care and personalised health
care provision; (ii) anti-terrorism, environmental, food, crime detection and proc-
ess monitoring sensors; (iii) targeted drug delivery systems; (iv) advanced ortho-
paedic and neural implants; (v) pharmaceutical screening and lab-on-chip devices;
and (vi) ubiquitous systems which monitor, interact with, and respond to biologi-
cal events, and link unobtrusively with information processing and communica-
tions systems. A thematic area common to this huge diversity of devices, applica-
tions and sectors, each of which on its own could form the subject of one or more
integrated projects, provides the focus of the Network for Biomedical Applica-
tions of Micro and Nano Technologies (NANOMED), which will not only ad-
vance the associated platform science and technology, but act to link diverse
communities.

1 Background

The dawn of nanoscale science can be traced to a now classic talk that Richard Feynman gave on 29 December 1959 at the annual meeting of the American Physical Society at the California Institute of Technology (http://nano.xerox.com/nanotech/feynman.html). In this lecture, Feynman suggested that there exists no fundamental reason to prevent the controlled manipulation of matter at the scale of individual atoms and molecules. Thirty-one years later, Eigler and co-workers [1] constructed the first man-made object atom-by-atom with the aid of a scanning tunnelling microscope. This was some 2,400 years after Democritus postulated atoms to be the fundamental building blocks of the visible world. The field derives its name from the SI prefix nano, meaning 1/1,000,000,000 of something. A nanometre is thus 1/1,000,000,000 of a metre, which is around 1/50,000 of the diameter of a human hair or the space occupied by 3-4 atoms placed end to end.
Nanoscale science and technology enables controlled component design and
fabrication on atomic and molecular scales. Nano-related research and develop-
ment unites findings and processes from biotechnology and genetic engineering
with chemistry, physics, electronics and materials science with the aim of manu-
facturing cost-effective innovative products.
The trend in manufacturing industry toward increasing miniaturisation, improv-
ing dimensional precision and controlling surface finish is well recognised.
Chemical, biological and drug sensors are being developed which combine single-molecule sensitivity and monolayer-thickness biological sensing layers with on-chip fluid handling and electronic read-out. Structure widths of 0.25 microns are now common in leading-edge microelectronic devices. Lithography lenses used for the
manufacture of such devices now approach nanometre-scale form precision and
sub-nanometre roughness. Microelectronic and integrated mechanical/electronic
devices (microelectromechanical systems or MEMS) contain thin functional films
only a few atom layers in thickness. These films must contain a minimum of de-
fects and have surfaces and interfaces which approach atomic smoothness. Manu-
facturing industry is increasingly linking macro-, micro- and nanoscale technolo-
gies to produce miniaturised and inexpensive electronic, sensing and actuator
systems. In doing so, it exploits the techniques and processes of the microelectron-
ics sector, while depending on new tools, fabrication and assembly techniques
borrowed from the biological, engineering, chemical and physics communities.
These techniques increasingly demand the controlled manipulation of matter on
the atomic and molecular scale. Increasing miniaturisation is accompanied by an
irrevocable increase in the importance of mastering and reliably implementing ex-
treme nanoscale technologies in a mass production manufacturing environment.

2 Biomedical Applications of Nanotechnology

The functional integration of man-made devices and biological systems represents


one of the grand challenges of science and technology. It lies at the intersection of
the biological, physical and information worlds. Realisation of the integration of all of these elements would allow us in future, for example, to create a new generation of 'smart' orthopaedic implants that can promote, monitor and control bone growth, or develop sensors and other devices able to monitor, interact with, respond to, and modify their biological environment. Since such devices and systems must interact with the biological world on the molecular and cellular scale, the structure and properties of the biological/physical systems interface must be designed and controlled on that scale. The importance of that interface at the cellular and sub-cellular scale, and of bi-directional communication across that interface, is illustrated pictorially in Fig. 1. This is the realm of molecular design, protein engineering, and the fabrication of functional hybrid devices and structures. It
underlines the viewpoint, expressed by the European Commission, that nanotechnology is not so much about a scale, but about the convergence of physics, chemistry and biology.

[Figure 1 labels: direct and indirect interface-to-cell and cell-to-interface signalling; sensor surface; signal production and signal response (fluid, electrical and mechanical modalities)]

Fig. 1. Pictorial representation of bi-directional communication at the interface between the biological (cellular), physical and information domains
The properties and functionality of the molecular-scale interface separating the
biological and physical worlds are key to successful integration. Ideally, the prop-
erties and functions of this interface should be indistinguishable from that of
neighbouring cells on the one side, and fully compatible with fluidics, microelec-
tronics and information processing on the other. This means that it should not be
just biocompatible or bioactive, but it should also support controlled material and
information transfer across the physical/biological interface.
The goal of modern medical science is to identify disease at the earliest point in
the disease process, intervene before symptomatic disease becomes apparent, and
monitor the progress of both the disease and the effects of those intervention pro-
cedures. This demands the development of technologies capable of detecting
predisposal to disease conditions and the earliest signatures of emerging disease,
and supporting immediate, specific and highly targeted intervention. Suitable plat-
form technologies must integrate an ability to sense, communicate, respond, and
monitor. Nanotechnology, which is widely predicted to lead to the next technological revolution, with biomedical applications showing the largest potential for growth, has the potential to provide enabling components toward fulfilment of these goals. It thus has the potential to transform health care and medicine.
Nanotechnology increasingly enables controlled component design and fabrica-
tion on atomic and molecular scales. We now have an unprecedented ability to
construct, 'from the bottom up', truly nanoscale biological components (e.g. Fig.
2). Such nanoscale functional components can be self-assembled into larger, mul-
tifunctional systems and complex devices that integrate required functions with
the ability to encapsulate and protect a drug molecule, sense the presence of a de-
fective cell, communicate that presence, and respond. Furthermore, nanotechnol-
ogy enables the precise structural and functional characterisation of such compo-
nents and their interactions with each other and with their environment. In parallel,

bioscientists are devising powerful techniques to design and re-engineer proteins


and other functional components derived from living systems, and chemists are
increasingly able to create complex molecules [2,3]. This is all happening as the
Human Genome and other sequencing projects on genomes of known pathogens
(such as tuberculosis) are generating tens of thousands of new and highly specific
targets for disease diagnosis and drug therapy.

[Figure 2 labels: biological system (cells); nanoengineered bio-interface characterised by topography, chemistry and protein engineering on both its biological and physical faces; physical system]

Fig. 2. Micro- and nanocomponents of the physical/biological interface

We are now faced with both the opportunity and challenge to integrate these
unprecedented developments in human knowledge across the life sciences, the
physical sciences and engineering, and so enable a step change in our ability to
address clinical priorities such as early, rapid, minimally invasive and highly tar-
geted intervention for cancer, immunological and genetic diseases and HIV. To
capture and exploit, in an effective and coherent manner, synergistic developments
associated with the post-genomic era is a key priority of all three UK Research Councils (EPSRC, BBSRC and MRC). To realise the potential offered by these
parallel developments requires not only research, but also the creation of a new
generation of researchers, who can work across traditional disciplines, and who
also know how to work with others at the interfaces between disciplines. Biomedical nanotechnology seeks to shape the next generation of clinical therapeutics, monitoring, diagnostic and research tools through appropriate and effective utilisation of nanotechnology, which many predict will be the next technological revolution. This multidisciplinary approach could revolutionise patient care.

3 Developing a Multidisciplinary Base - The NANOMED


Network

The Network for Biomedical Applications of Micro and Nano Technologies


(NANOMED) was formally established on January 15th, 2000 and is co-ordinated
by the Institute for Nanoscale Science and Technology (INSAT) at the University
of Newcastle upon Tyne (http://nanocentre.ncl.ac.uk). Like the CytoCom Network
on which this book is based, NANOMED has received seed funding from the U.K.
Engineering and Physical Sciences Research Council (EPSRC) to assist its estab-
lishment.

The specific aims of the Network are to seek to combine nanotechnology with powerful molecular design and protein engineering techniques to develop nanoengineered, robust, communicative biological interfaces to the physical and information worlds. Since the technology must interact with the biological world on the molecular and cellular scale, the structure and properties of that interface must be designed and controlled on that scale. This is the realm of molecular design, protein engineering, nanotechnology, and the fabrication of functional hybrid devices and structures. These will be biocompatible, robustly attached to the non-biological surface, and possess application-specific components permitting material and information transfer across the interface. The resulting platform technologies will enable us to harness the full potential of genomic information through real-time predictive, preventive, point-of-care and personalised health care provision, and enable major and affordable advances in orthopaedics and trauma, pharmaceutical screening devices, environmental and process monitoring, intelligent communications systems, forensics and defence.
It is widely accepted that the combination of nanotechnology and biomaterials science will lead to the next technological revolution, with biomedical applications showing the largest potential for growth. Nanotechnology enables component design and fabrication on atomic and molecular scales and the self-assembly of such components into larger, multifunctional systems. Furthermore, it enables the precise structural and functional characterisation of such components and their interactions with each other and with their environment. This Network will combine macromolecular chemistry, nanotechnology, and powerful methods to re-engineer proteins and other functional components derived from living systems. This will result in techniques, devices and systems that can efficiently monitor molecular interactions within and between cells, providing real-time single-molecule sensitivity and personalised diagnostic tests, more rapid pharmaceutical screening procedures, and prompt detection of biological contamination. The development of techniques that would allow us to influence molecular interactions in real time would enable us to create, e.g., orthopaedic implants able to respond to and control bone formation. Common to all these examples is the need to create, and in some cases communicate across, an interface between biological and physical environments.
The recent completion of the Human Genome Project has provided data on the
sequence of the approximately 30,000 human genes. This is a major achievement,
but is in reality merely an entry into the post-genomic age of applied genomics
and proteomics. Understanding the structure and function of these genes and the
proteins that they encode will potentially revolutionise our understanding of hu-
man disease. Pervasive real-time application of that understanding, which this pro-
ject would enable, will revolutionise treatment. The grand challenge identified
within our strategic vision appears tantalisingly within grasp following recent advances. Only with such technology can we interface the 'virtual information' of
the human genome with e.g. the physical reality of prediction, diagnosis and
treatment in medicine.

4 Initial Challenges to NANOMED Problems

The initial challenges for Network research activity have been identified as integrating microsystems technologies, which offer revolutionary possibilities for sensors and sensor arrays for drug screening, with nanoscale assembly components, and integrating such components into 'smart' therapeutic delivery vehicles (Fig. 3).

[Figure 3 labels: nuclear entry component; protection of DNA from degradation; endosomolytic component; optimal size for cellular uptake; condensing agent with optimal binding, biocompatible; targeting moiety; protection from opsonization]

Fig. 3. Schematic representation of a nano-engineered multicomponent therapeutic delivery system

It has been recognised that an increasing number of cell membrane channels are found to act as targets for medicines and other compounds. The ability to detect the modulation of cell membrane channel activity by the binding of therapeutic agents is considered crucial for rational and efficient drug discovery and design. Since combinatorial libraries of potential therapeutic compounds are rapidly growing, fast and highly sensitive methods for functional drug screening are required. An attractive possibility is the use of self-assembled tethered membranes containing specific channel receptors as the sensing element in an otherwise solid-state biosensing device. Massive arrays of individually addressable microsensors with integrated fluid handling are conceivable. Even very simple sensor designs offer valuable advances in low-cost sensing for clinical medicine (especially point-of-care) and the food and hygiene sectors.
Biological sensor systems are predicted to play an important role in preventative medicine and early diagnosis. The much-discussed 'artificial nose' containing
a dense array of receptor sites affording unambiguous identification of molecular
species could analyse the breath of patients for known chemical signatures of dis-
eases such as liver cirrhosis and lung cancer [4]. Applications of such devices as

chemical sensors in industry, food processing, military combat situations and envi-
ronmental monitoring are promised.
Recent developments [e.g. 5] in scanning probe based microscopies (e.g. AFM,
STM, SECM) and molecular manipulation techniques such as 'optical tweezers'
[6] promise further advances in determining the structure of cell membranes on
the nanometre scale, providing a potential key to understanding processes such as
immune reactions and the development of viral infections. An ability to perform
controlled manipulation procedures on DNA, the creation of artificial enzymes,
proteins and ribozyme catalysts would open a wide range of speculative possibilities. In addition, micro- and nanoscale device fabrication technologies have a potentially important role to play in clinical medicine, psychology and the behavioural sciences in enabling the development of sophisticated, minimally invasive, remotely addressable, yet affordable, sensors for quasi-real-time recording of neurological activity and other biological functions.
Finally, genetic tests have almost unparalleled scope in medicine and biotechnology. With applications in genetic diagnostics in human and animal subjects and in the detection of pathogens, the demand for genetic information is essentially unlimited and determined only by the cost of information retrieval. The DNA sequence for a significant fraction of the 3.3 billion nucleotide base-pairs in the human genome has already been determined, in principle providing sufficient information to measure the altered cellular patterns of expression of these genes in different physiological conditions and disease states, including cancer. Rapid, inexpensive methods of detection of these are eagerly sought, to enable clinicians to identify a great many pathological conditions with greater speed and certainty. Dense microarrays of up to 100,000 oligonucleotides are coming into use in genetic research, particularly in the pharmaceutical industry. Much effort is being devoted to developing compact, inexpensive microarray-based systems which could ultimately accommodate, on a single substrate, all processing functions for DNA amplification, separation, hybridisation and detection of picolitre sample volumes.

5 Concluding Remarks

The integration of biotechnology, information technology, cognitive science and


nanotechnology is viewed in North America, the Asia-Pacific Rim and Europe as
key for the creation of new scientific and industrial fields, and to enable progression of advanced industrialised nations. Europe has international strengths in biotechnology, pharmaceuticals and some 'front-end' sensor technologies. It is
weaker in mass-market electronics and communications technologies. In contrast,
Japan is strong in the latter areas and Japanese electronics and communications
companies have recognised the huge potential markets associated with ubiquitous
sensing capability and the importance within that of information contained within,
derived from, or impacting upon biological systems. Indeed, biomedical applica-
tions are recognised by most industry analyses (e.g. NEXUS) as providing, after
ICT, the largest and quickest wealth creation opportunities for microsystems and
nanotechnology.

A key prerequisite for many of these ubiquitous systems is the development of


devices that enable reliable real-time communication between physical and bio-
logical systems. Key concepts include biomimetics, the incorporation of biologi-
cal molecules into otherwise electronic devices, mimicking biological structures in
fabricated devices, and the incorporation of biological signal processing strategies
in the logic of electronic systems and communications networks. The properties
and functionality of the molecular-scale interface separating the biological and
physical worlds are key to successful integration. Ideally, the properties and func-
tions of this interface should be indistinguishable from that of the neighbouring
biological milieu on the one side, and fully compatible with fluidics, microelec-
tronics and information processing on the other. This means that it should not be
just biocompatible or bioactive, but it should also support controlled material and
information transfer across the physical/biological interface. It is the pervasive real-time application of understanding that interface, requiring the technology platform at the very heart of this Network, which will revolutionise diagnosis, treatment and prevention.

References

1. Eigler D.M., Schweizer E.K. (1990) Positioning single atoms with a scanning tunnelling microscope. Nature 344, 524-526.
2. Lakey J.H., Reddy B.V., Murray-Rust J. (2001) Macromolecular assemblages. Theory and simulation. Curr. Opin. Struct. Biol. 11, 139-140.
3. Davis B.G. (2002) Synthesis of glycoproteins. Chem. Rev. 102, 579-602.
4. Thaler E.R., Kennedy D.W., Hanson C.W. (2001) Medical applications of electronic nose technology: review of current status. Am. J. Rhinol. 15, 291-295.
5. Blackley H.K.L., Davies M.C., Sanders G.H.W., Roberts C.J., Tendler S.J.B., Wilkinson M.J., Williams P.M. (2000) In-situ atomic force microscopy study of beta-amyloid fibrillization. J. Mol. Biol. 298, 833-837.
6. Meiners J.C., Quake S.R. (2000) Femtonewton force spectroscopy of single extended DNA molecules. Phys. Rev. Lett. 84, 5014-5017.
Macromolecules, Genomes and Ourselves

S. B. Nagl
Department of Biochemistry and Molecular Biology, University College London,
Gower Street, London WC1E 6BT, UK
nagl@biochemistry.ucl.ac.uk

J. H. Parish
School of Biochemistry and Molecular Biology, The University of Leeds, Leeds
LS2 9JT, UK

R. C. Paton
Department of Computer Science, The University of Liverpool, Liverpool L69
3BX, UK

G. J. Warner
Unilever Research Colworth, Colworth House, Sharnbrook, Bedford MK44 1LQ, UK

Abstract. 'Bioinformatics' is used to describe computational topics in molecular and cellular biology. As a discipline it involves cross-fertilisation of ideas between computer science and modern biology. DNA, RNA and protein are classes of macromolecule whose members play several roles including inheritance, biological information processing, signal transduction and catalysis. Methods of classifying these molecules are central to current methods for elucidating relationships between sequence, structure and function. We take as a case study metaphors for the function of proteins and point to a unified view of proteins as computational devices capable of matching patterns as inputs and processing to result in alternative outputs. Finally we consider the requirement for a systems view of life in order to construct new models for the era of post-genomic biomedicine. The subject has an ethical dimension and we consider the case that such models are metaphoric constructions.

1 Preamble

The advent of structural molecular biology and the development of rapid methods
for macromolecular sequencing coincided roughly with the development of digital
computers. Molecular biologists have thus been computer users for several dec-
ades. More recently computational molecular biology has been recognised as a
methodological discipline in its own right and has led to the neologism, 'bioinfor-
matics'. The subject has represented a productive cross-fertilisation of computa-
tional and biological ideas. Arguably it started with a computational solution to
the general problem of aligning strings with the concomitant insertion of gaps by
126 S. B. Nagl et al.

an efficient method: this method, variously referred to as 'the Needleman and
Wunsch algorithm' or 'dynamic programming', dates from 1970 and was designed
specifically for scoring similarities between protein sequences. More recently, ge-
netic recombination, molecular evolution and the antigen-antibody interaction
have given rise to familiar biological metaphors in computing. Equally, as the data
sets interrogated by molecular biologists and biochemists have become ever lar-
ger, computer science methods including data mining, agent-based methods,
automated annotation and rapid clustering algorithms are increasingly required for
biochemical and molecular biological research. For newcomers to macromolecular
science, even a modest web site such as bbsrc-bioinf.leeds.ac.uk/BIOINF/bioinfor-
matics.html will give a flavour of current issues in computational molecular biol-
ogy and lead to an impression of the size of the data sets that are in place.
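The dynamic-programming idea behind such sequence alignment can be sketched in a few lines of Python. The function below is a minimal global-alignment scorer with a unit match score and a linear gap penalty; the function name and scoring scheme are our illustrative choices, not the 1970 formulation itself:

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two sequences by dynamic programming."""
    n, m = len(a), len(b)
    # F[i][j] holds the best score for aligning a[:i] with b[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap                    # a[:i] against gaps only
    for j in range(1, m + 1):
        F[0][j] = j * gap                    # b[:j] against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # align the two residues
                          F[i - 1][j] + gap,     # gap in b
                          F[i][j - 1] + gap)     # gap in a
    return F[n][m]

print(needleman_wunsch("GATTACA", "GCATGCU"))
```

Tracing back through F recovers the alignment itself, and substituting a residue-pair scoring table (such as a PAM or BLOSUM matrix) for the fixed match/mismatch scores gives the form used for proteins.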
In this chapter we aim first to review the biological perspective of computa-
tional molecular biology, second to consider the metaphoric abstraction of protein
functions and third to highlight issues in the integration of molecular and cellular
considerations as applied specifically to the impact on our own species.

2 Macromolecules: Properties and Classification

2.1 Architecture, Form and Function

Biological macromolecules are certainly not the largest molecules in the world:
for example a diamond is just one molecule of carbon. However biological mac-
romolecules display a special type of complexity. The structure of a diamond is
based on a simple rule derived from the properties of a saturated tetravalent car-
bon atom. In contrast, although biological macromolecules are based on relatively
simple chemical building blocks such as amino acids and nucleotides, their chemi-
cal properties are dominated by the chiral (asymmetric) nature of the building
blocks. Chemical interactions involving biological macromolecules are highly se-
lective and specific. This specificity extends to the interactions that form the rules
that govern the ways in which the macromolecular chains can fold to generate the
conformations (shapes in three dimensions) that are characteristic of these mole-
cules. Although some of the rules are still to be elucidated, one important gener-
alisation is valid: the complexity associated with emergence of nucleic acids and
proteins led to concomitant emergence of a new property - a type of chemically
encoded information. The information includes rules for the folding of a protein
(encoded by the protein sequence itself) and the information for the encoding and
regulated expression of genes in DNA (RNA in certain viruses). In this latter case
the proteins involved in the regulation and expression processes can be regarded
as decoding machines and the DNA (or RNA) as an equivalent of a program con-
taining instructions and data. In Sec. 3 we pursue the idea that proteins process in-
formation in the case of enzymic and other functions. Biological macromolecules
are involved in the regulation of cellular activity. Biologically they can be re-
garded as having several roles: structural, regulatory, catalytic and genetic. In this
context we use 'genetic' to refer to the inheritance, transmission and expression of
genetic information. We note that any one macromolecule can fulfil more than one
of these roles, not simply because of possible ambiguity in the definitions of the
roles (the last three are all examples of information processing) but because a single
macromolecule can have different roles: for example, ribosomal RNA is a component
in gene expression, is structural and is an enzyme (catalyst). In this chapter we ex-
clude certain classes of macromolecule, polysaccharides, lipids, lipopolysaccha-
rides, and concentrate chiefly on proteins but in this introduction we put them into
context with DNA and RNA. A biochemist's summary of the roles of these mole-
cules is given in Table 1.

Table 1. Biological Functions of DNA, RNA and Proteins

Roles                 DNA                     RNA                     Protein
As genetic material   Cells and DNA viruses   RNA viruses, viroids^a  Prions
Structural roles      In chromatin and its    In ribosomes            Many proteins are
                      bacterial equivalent                            structural
Storage                                                               Plant seed proteins are
                                                                      reserves
As data               Genes                   Genes in RNA viruses    The sequence determines
                                              and as messenger RNA    the fold^b
As catalysts                                  Ribozymes               Most enzymes^c
As regulators of                              Several examples        Many examples of DNA-
gene expression                                                       or RNA-binding proteins
In signal             DNA is the UV                                   As receptors and media-
transduction          receptor for reactions                          tors of most biological
                      that lead to its own                            signal transduction
                      repair
Repair^d              Many pathways           Aminoacyl-tRNA
As machines^e                                 In ribosomes            Many examples
As decoding                                   In ribosomes            Polymerases, aminoacyl-
devices^f                                                             tRNA synthases, in
                                                                      ribosomes

Notes for Table 1.

a These are rather special examples of macromolecules (viroids are small in-
fectious RNA molecules that cause certain diseases in plants) that affect
their own synthesis from cellular genes
b A small caveat is that the folding process is frequently accelerated by
other proteins ('chaperones' or 'chaperonins')
c An enzyme is a catalyst that acts as a transient component in the reaction
mechanism with the result that the rate of reaction is saturable. In the case
of a 'classical enzyme', the rate of conversion ('reaction velocity', v) of a
substrate (S, 'reactant' in chemistry) to product(s) is:
    v = Vmax[S]/(Km + [S])
where Km is a measure of the affinity of S for the enzyme and Vmax is a meas-
ure of the quantity of enzyme in the cell, test tube, etc.
d We follow biochemists in regarding repair as a process in which a macro-
molecule that has suffered damage (a chemical lesion of some kind) is re-
paired without recourse to destroying the damaged molecule and resynthe-
sising it. Given this definition only DNA, tRNA and the bacterial cell wall
(not featured in the table) are truly repaired
e Here we refer to processes that involve the expenditure of chemical free
energy to achieve mechanical work on either the sub-cellular (e.g. move-
ment of ribosomes, segregation of chromosomes) or organismic (e.g. mus-
cle contraction, collapse of leaves in sensitive plants) scale
f DNA sequences can be written with four letters, RNA can also be
written with four letters and proteins can be written with 20 (actually 21)
letters; biochemists use the ideas of transcription and translation as the de-
coding of DNA to RNA and RNA to protein respectively. The sequence re-
lationship between RNA and proteins is referred to as the Genetic Code
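The saturable rate law in note c translates directly into code; a minimal sketch (the function name and parameter values are our own, for illustration):

```python
def reaction_velocity(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax[S] / (Km + [S])."""
    return vmax * s / (km + s)

# Two standard properties of the equation: v is half-maximal at [S] = Km,
# and it saturates towards Vmax as [S] grows large.
vmax, km = 100.0, 2.0
for s in (0.5, km, 20.0, 200.0):
    print(s, reaction_velocity(s, vmax, km))
```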
Proteins are composed of amino acid residues and their sequences can be repre-
sented as strings of 21 or more characters. Active proteins have characteristic 3D
structures that we refer to here as 'folds'. There are no reasons to believe that in-
formation other than that in the sequence is needed to achieve the folding process.
One of us [1] has proposed that a protein sequence might be regarded as a sen-
tence with a 3-dimensional meaning. There is a growing understanding of the
mechanism of folding, 'the folding pathway'. Modern techniques have revealed
such pathways and have led to the discovery that for many proteins there are alter-
native pathways and this is another example of a biological generalisation: bio-
logical pathways are redundant and robust. Such pathways may develop emerging
behaviour: in the case of protein-folding we cannot detect this behaviour except
that we know of cases where protein-folding anomalies can occur. Apart from
studies on model or 'designer' proteins, there are several examples of protein fold-
ing diseases. Two of the most familiar are Alzheimer's disease and the encephalo-
pathies. In the former case, the mis-folding of a protein, called 'amyloid', results in
the formation of plaques characteristic of Alzheimer's histopathology: in the case
of the infectious encephalopathies, the protein-folding anomaly becomes transmis-
sible because the new fold affects the folding of the prion gene product and the
expression of the prion gene.
Proteins are linear polymers containing various combinations of the 20 amino
acids, the sequence of which is referred to as the primary structure of the protein
and forms the regular hydrogen bonded structures, in particular α-helices and β-
strands, that make up the secondary structure of the protein. Experimental studies
indicate that secondary structure collapses to create supersecondary structural mo-
tifs that are the earliest segments of the polypeptide to fold and remain stable dur-
ing the folding process [2-5]. These findings suggest that proteins do not explore
large numbers of possible conformations, rather they are limited to a set of topolo-
gies imposed by the collapse of the secondary structure.

However, when we say that DNA contains what is necessary to specify a living
being, we divest this cell component of its interrelations with the rest of the net-
work. It is the network of interactions in its entirety that constitutes and specifies
the characteristics of a particular cell, and not one of its components. That modifi-
cations in those components called genes dramatically affect the structure is very
certain. The error lies in confusing essential participation with unique responsibil-
ity [6]. This has implications for biomedical science and we return to this topic in
the final section.

2.2 Data Resources

The sequences of many proteins are known and the majority of these sequences
have been deduced from DNA sequencing studies, notably from the many genome
projects. Several databases are non-redundant, in other words proteins that are
identical or almost identical in closely related species (e.g. human beings and
chimpanzees) are only represented once. Some proteins are of known structure
and others are of known function (these two sets intersect). Protein sequence data-
bases contain annotations about the source of the protein and such structural and
functional details as are known or have been deduced.
The PIR-International Protein Sequence Database (PSD) [7] is maintained by an
international consortium comprising the Protein Information Resource (PIR) in the
United States, Japan and Germany. As of September 2003 it contained over
142,000 annotated, non-redundant protein sequences. NRL-3D [8] is a supplemen-
tary database to PIR and is produced from sequence and annotation information
extracted from the Brookhaven Protein Databank (PDB) of three-dimensional
structures [9].
SWISS-PROT [10] is a protein sequence database and is the result of collabora-
tion between the Department of Medical Biochemistry at the University of Geneva
and the European Molecular Biology Laboratory (EMBL). Translated EMBL
(TrEMBL), introduced in 1996, is a computer-annotated supplement to SWISS-PROT [10].
From these, several composite databases have been constructed.

2.3 Protein Classification

Much contemporary research in computational biochemistry addresses the prob-
lem of deducing from a sequence the predicted structure and function. Like their
antecedents in biology, biochemists have been preoccupied with classification
without necessarily having thought very carefully about ontologies and evolving
criteria. An earlier example was the development of 'E.C.' numbers, created by the
Enzyme Commission to create a four-digit identifier (starting with 1.1.1.1 which is
alcohol dehydrogenase). A similar approach has been used for protein sequences.
All protein sequences can be clustered by pair-wise alignment to create a cladistic
or phylogenetic tree that shows the divergence or evolution of a sequence. We
point out at this stage that a striking feature of these trees is that, although they
correspond roughly to the divergence of species during evolution, the time-scales
vary enormously from protein to protein (or indeed RNA to RNA). In other words
if, in some sense, the evolution of an organism is a summation of the evolution of
its macromolecules, the 'macroscopic' rules (for the cell, the organism) are con-
strained in a different way from the 'microscopic' rules for the molecules.
By using either computational methods for structural three-dimensional alig-
ment, automated analysis of elements of the structure, inspection by human ex-
perts, or a combination of these methods there are three alternative classifications
of proteins of known structure and/or function. SCOP [11] organises proteins in a
hierarchy from class, down through fold and superfamily to family. CATH [12]
employs a hierarchical classification system and a greater degree of automation.
Proteins are grouped on the basis of sequence similarity and a representative of
each group is taken and divided into domains using three automatic domain-
assignment techniques. We note that biochemists use the word 'domain' to mean a
module of a protein structure that they know or believe to be partly independent.
FSSP [13] also relates protein structures on an evolutionary basis, but is a fully
automated procedure and does not assign proteins to classes, folds or families. Ex-
haustive pair-wise structure comparisons are made and the results represented as a
fold tree, which is generated by hierarchical clustering, and as a series of structur-
ally representative sets of folds at varying levels of uniqueness.
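Both the sequence trees of the first paragraph of this section and the FSSP fold tree rest on the same basic operation: agglomerative clustering of exhaustive pair-wise comparison scores. A toy single-linkage sketch follows (the labels and distances are invented; real methods such as UPGMA or neighbour-joining differ in the linkage rule and in how branch lengths are assigned):

```python
def single_linkage(labels, dist):
    """Agglomerative single-linkage clustering of a pairwise distance table.

    dist maps frozenset({a, b}) -> distance. Returns a nested-tuple tree.
    """
    clusters = {lab: lab for lab in labels}   # cluster name -> subtree
    d = {k: v for k, v in dist.items()}       # work on a copy
    while len(clusters) > 1:
        # find the closest pair of current clusters
        a, b = min(
            ((x, y) for x in clusters for y in clusters if x < y),
            key=lambda p: d[frozenset(p)],
        )
        merged = (clusters.pop(a), clusters.pop(b))
        name = a + b
        # single linkage: distance to the merged cluster is the minimum
        for c in clusters:
            d[frozenset((name, c))] = min(d[frozenset((a, c))],
                                          d[frozenset((b, c))])
        clusters[name] = merged
    return next(iter(clusters.values()))

# Invented distances between four "sequences"
labs = ["A", "B", "C", "D"]
dm = {frozenset(p): v for p, v in [
    (("A", "B"), 1), (("A", "C"), 4), (("A", "D"), 6),
    (("B", "C"), 4), (("B", "D"), 6), (("C", "D"), 2),
]}
print(single_linkage(labs, dm))
```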

2.4 Protein Signatures

A signature is some sort of restricted part of a protein sequence that relates the
protein to a position in a structural/functional classification. A short signature in
this sense is referred to as a 'motif'. One approach is to rely only on the sequence
as a source of the signature. In this approach the sequences can be aligned or ana-
lysed in some way to create the signature. Such signatures can be used to construct
'secondary databases'. PROSITE [14] was one of the first secondary databases to
be developed. The rationale behind the content was that the single most conserved
motif observable in a multiple alignment of known homologues could be used to
effectively characterise a protein family, the motifs themselves usually encoding
vital biological functions. The motifs are represented as regular expressions. Thus,
by searching the database, a new sequence could be allocated to a family, or the
domains or functional sites that the sequence contained may be elucidated.
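Searching with such a regular expression is straightforward. The sketch below uses a PROSITE-style pattern modelled on the C2H2 zinc-finger motif, translated into Python regex syntax; both the pattern and the test sequence are illustrative rather than quoted from the database:

```python
import re

# A PROSITE-style motif translated into a regular expression.
# Illustrative pattern (modelled on the C2H2 zinc-finger motif):
#   C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
motif = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")

seq = "MKETAYCPVCGKAFSRSDHLALHMRTH"   # invented test sequence
hit = motif.search(seq)
if hit:
    print("motif found at", hit.start(), hit.group())
```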
The use of several motifs to characterise a protein family is preferable to using
just one because of the possible benefit to diagnostic performance. For example, a
protein fingerprint [15] offers greater diagnostic reliability over single-motif
methods because if a query sequence fails to match all the motifs in a given fin-
gerprint, the remaining motifs will still allow a fairly confident diagnosis. The da-
tabase is called PRINTS-S.
Profiling provides a means of detecting distant sequence relationships in cases
where regular expressions will not provide very good discrimination. The com-
plete sequence alignment effectively becomes the discriminator; the profile is
weighted to indicate where insertions and deletions are allowed, which residues
are allowed at each position and where the most conserved regions are. The pro-
files can be encoded in the form of Hidden Markov Models (HMMs), consisting
of linear chains of match, delete or insert states which can be used to represent the
sequence conservation within aligned families. An advantage of HMMs is that
they are not limited to searches for motifs. Pfam [16] is a database of protein do-
main families containing curated multiple sequence alignments for each family
and a collection of HMMs for finding these domains in new sequences. In Pfam-A
a collection of hand-edited seed alignments is used to build HMMs using the
HMMER package, to which sequences are automatically aligned to generate final
full alignments. If these do not produce diagnostic HMMs the process is iterated
until a good result is achieved. The seed and full alignments are then coupled with
minimal annotations, database and literature cross-references, and the HMMs
themselves.
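The simplest form of profile, a position-specific log-odds table built from an aligned family, is the precursor of the full match/insert/delete HMM described above, and can be sketched as follows (the toy alignment, pseudocount scheme and uniform background are our own illustrative choices):

```python
from collections import Counter
from math import log

def build_profile(alignment, alphabet="ACDEFGHIKLMNPQRSTVWY", pseudo=1.0):
    """Column-wise log-odds profile from a gapless toy alignment."""
    n = len(alignment)
    background = 1.0 / len(alphabet)          # uniform background frequency
    profile = []
    for col in zip(*alignment):               # iterate alignment columns
        counts = Counter(col)
        scores = {a: log(((counts[a] + pseudo) / (n + pseudo * len(alphabet)))
                         / background)
                  for a in alphabet}
        profile.append(scores)
    return profile

def score(profile, seq):
    """Sum of per-column log-odds scores for an ungapped candidate."""
    return sum(col[a] for col, a in zip(profile, seq))

family = ["HDELL", "HDELV", "HNELL", "HDEIL"]   # invented aligned family
prof = build_profile(family)
print(score(prof, "HDELL"), score(prof, "AAAAA"))
```

A family member scores well above an unrelated sequence; a full profile HMM adds explicit insert and delete states with their own transition probabilities.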
An alternative approach is to use as a starting point detailed analysis of those
proteins whose structures are known and thus to generate sparse signatures. Sev-
eral methods [17-21] for the sparse representations of secondary structures are
published. Daniel et al. [22] realised that such signatures could be very fuzzy se-
quence-length 'motifs' but as these could not be represented as regular expressions
a new alignment algorithm was needed.

3 Models and Metaphors

3.1 Proteins as Machines

The distinction between enzymes and other functional classes of macromolecule
(Table 1) is probably misleading. The fundamental molecular architecture of
members of several structural families of proteins is common and there are several
examples of such families containing members with different roles. Also the con-
cept of 'signal transduction' is used in Table 1 to refer to the 'classical' biological
examples of external signals that affect the behaviour of a cell or tissue: such ex-
amples include the molecular basis of vision, the responses to hormones, taxis in
micro-organisms and responses to mating pheromones. Signalling networks are
discussed in Box 1. The idea that enzymes can be regarded as machines was sug-
gested by Changeux [23] and has been extended to a general view of enzymes that
can be extended further to many macromolecular functions (Fig. 1) [24].

[Figure: input pattern (substrates) → pattern recognition → process/compute → outputs (catalyse, switch, modulate, integrate)]

Fig. 1. The properties of an information processing enzyme [24]

3.2 Information Processing by Proteins

The idea of cell-as-text has been used to address questions related to reduction be-
tween (say) cytology and molecular biology [25, 26]. Many tools in molecular bi-
ology, both experimental and conceptual devices, have the same effect, namely
they dissect, cut open and reduce the cell as a whole system into bits and pieces.
An important conceptual and maybe also perceptual challenge is how to think
about the integrative whole instead of or as a complement to the bags, chips, gels
and spectra of the parts. This section will address these issues with an examination
of the metaphors of cell-as-text and cell-as-ecology.
Proteins exhibit sophisticated information processing capacities. For example,
enzymes can display pattern recognition, memory capacity, context-sensitive ac-
tivity, handling of fuzzy internal and external events, switch-like behaviour, inte-
gration of a number of metabolic pathways and other processes and signal ampli-
fication. It has been argued elsewhere [27] that it is possible to think of enzymes
as playing the central role of verbs in the cellular metabolic and information proc-
essing system. Like verbs, enzymes can be said to have cases (in the sense elabo-
rated by Fillmore [28]). Within the context of this linguistic metaphor, enzyme
cases would include substrate, product, regulator(s), locations, associations, co-
agent(s) and target site(s). Certain enzymes that exhibit the mood- or voice-like
properties of verbs can be related to their context-sensitivity and also to their in-
ternal configuration and localised interactions. The notion of modality in enzyme
action is implicit in a number of recent descriptions including: the fluctuating en-
zyme [29]; the seed-germination model [30] and enzymes as logical agent/verb
[26].
A number of glycolytic enzymes are sensitive to the micro-environments and
cell types in which they are found. For example hexokinases, which catalyse the
phosphorylation of hexose sugars, come in various isoforms including brain
hexokinase (BHK) and glucokinase (liver). Certain metabolic conditions affect
BHK behaviour with rapid and reversible changes between soluble and particulate
forms in which the latter is more active than the former. Another context-sensitive
glycolytic enzyme is phosphofructokinase-1 (PFK-1) which catalyses the phos-
phorylation of fructose 6-phosphate (F6P) to fructose 1,6-bisphosphate (F1,6BP).
PFK-1 exhibits sigmoidal kinetics when in free solution but a normal
saturation curve when membrane-bound [31]. A special case of a context-sensitive
enzyme that has 'voices' is 6-phospho-fructo-2-kinase/fructose-2,6-
bisphosphatase (6PF2K/F2,6BP). This enzyme catalyses two opposing reactions:
F6P + ATP → F2,6BP + ADP
F2,6BP → F6P + Pi
As Pilkis et al. [32] note, 6PF2K/F2,6BP has, in addition to a catalytic role, a
key function in intracellular signalling.
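The contrast between PFK-1's two kinetic regimes can be made concrete with the Hill equation, which reduces to the hyperbolic Michaelis-Menten curve of Table 1, note c, when the Hill coefficient n = 1 and becomes sigmoidal for n > 1 (the parameter values are illustrative, not measured values for PFK-1):

```python
def hill_velocity(s, vmax, k_half, n):
    """Hill equation: v = Vmax[S]^n / (K^n + [S]^n).

    n = 1 gives the hyperbolic Michaelis-Menten curve; n > 1 gives a
    sigmoidal (cooperative) saturation curve.
    """
    return vmax * s ** n / (k_half ** n + s ** n)

vmax, k = 100.0, 2.0
for s in (0.5, 1.0, 2.0, 4.0, 8.0):
    # compare hyperbolic (n = 1) against sigmoidal (n = 4) behaviour
    print(s, hill_velocity(s, vmax, k, 1), hill_velocity(s, vmax, k, 4))
```

Both curves pass through the half-maximal velocity at [S] = K, but the sigmoidal form responds far more weakly at low substrate concentration and far more steeply around K, which is the switch-like behaviour discussed above.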
Intracellular signalling systems employ inter-communicating networks of
kinases and phosphatases. These enzymes are often multi-functional and highly
integrative. Consider calmodulin-dependent protein kinase II (CaM Kinase II), a
large multimeric and multifunctional enzyme that is derived from four genes. It
acts on upwards of 49 substrates and is very common. CaM Kinase II exhibits a
memory capacity as well as being able to amplify signals. Four functional do-
mains are defined: catalytic, regulatory (inhibitory and CaM binding regions), as-
sociation (with other subunits) and variable (for targeting and localisation). Apart
from the variable domain, the other three domains are highly conserved.
An examination of the eukaryotic transcription factors CBP and p300 provides
further insights into 'glue' relations. CBP/p300 are large multi-functional proteins
that participate in various basic cellular functions, including DNA repair, cell
growth, differentiation and apoptosis. They act as focal points for multiple pro-
tein-protein interactions and co-activate many other transcription factors including
CREB, nuclear receptors, signal transducer and activator of transcription (STAT)
proteins, p53, and the basal transcription proteins. Using the review by Giles et al.
[33] it is possible to talk about CBP/p300 acting like 'glue' in five ways that relate
to (1) molecule-molecule bindings and interactions, (2) enzymatic processes, (3)
as a physical bridge between various transcription events, (4) acting as histone
acetyltransferases (HATs) - linking transcription to chromatin remodelling - and
(5) mediating negative and positive crosstalk between different signalling pathways.
It is important to note that, from our point of view, this 'glue' is not just that the
molecules have intrinsic adhesive properties, they also provide the cell with com-
binatorial and cohesive properties at a functional level.
Fig. 2 summarises some of the relations between 'gluings' and function for
CBP/p300. Object and process can be subsumed as one general term 'glue' with
respect to the multi-functionality of these proteins. What is more, it is possible to
introduce the topological thinking of local → semi-local → global into the context
in which the 'glue' functions. This relates to the terms in the boxes on the left of
the diagram. Many verbs can be related to CBP/p300 action, such as bind, interact,
process, bridge, act, link, transcribe, remodel, mediate, talk and signal. The con-
textual and 'gluing' descriptions in Fig. 2 can be used to organise function descrip-
tions in terms of integration within the hierarchy (context) and interaction across
levels (glue).

[Figure: three columns - CONTEXT, GLUE, FUNCTION - with gluings including crosstalk, regulation, bridging and HAT activity]

Fig. 2. Some 'gluings' associated with CBP/p300

Fig. 2 is a network that summarises the gluings associated with a family of pro-
teins that participate in several cellular activities. Such a convention could be used
for describing several cases: the complex regulation of nitrogen assimilation in en-
teric bacteria involves glutamine synthase which is not only an enzyme but is in-
volved in the regulation of its own biosynthesis or the pCI protein of the bacterial
virus lambda which operates as a repressor of the expression of certain genes in-
cluding its own and is also an activator of its own expression. Multifunctionality
can arise in multidomain proteins (Sec. 2.3) in which the domains have different
functions. An example of medical importance is the thymidylate synthase-
dihydrofolate reductase protein in the malarial parasites. This protein which arose
from a gene fusion associates two different enzyme functions with adjacent steps
in a metabolic pathway required for DNA synthesis. In the next section we address
the problem of more complex systems. Ultimately descriptions such as Fig. 2 may
provide a formal representation. At present the data set is far too incomplete: there
are many examples of multifunctionality yet to be discovered and therefore we use
a more coarse grained model.

4 Modelling of Complex Cellular Systems for Post-genomic Biomedicine

4.1 Introduction: A Systems View of Life

We have entered the 'post-genome era' and, as techniques for the investigation of
cells at the systems level become more and more refined, are witnessing a revolu-
tionary reorientation in the conceptual foundations of biomedicine. Questions
about the molecular nature of disease can now be asked at the level of differential
activation states of the genome ('transcriptomics') [34], populations of gene prod-
ucts, including splice variants and post-translationally modified versions ('pro-
teomics') [35, 36], or functional interactions in large networks (for example [37]
and Fig. 3). In complementary fashion, the structural organisation of cell systems,
from single biomolecules, to molecular complexes, cellular compartments, and in-
dividual cell types, is also being revealed at greater and greater resolution.
Cells are increasingly coming to be seen as systems of interacting structures
with information-processing capabilities. Some of the novel questions raised are,
for example: What can global gene expression patterns in model organisms teach
us about the order and logic of the genome? How do genomes and other biological
systems evolve? How do large-scale networks of molecular interactions integrate
biological signals within cells? How does cellular localisation affect protein func-
tion? It is to be expected that, in the future, the systems properties of biomolecules
will also become a major focus for new research.
"Systems thinking" goes hand in hand with a focus on information processing.
Ideally, explanations are sought as to how the spatial and temporal organisation of
mixed populations of molecules gives rise to the 'smart' properties of biological
systems. Such "smartness" is not restricted to high-level processes like conscious-
ness, but is in evidence at all levels of biological organisation. Currently, much in-
terest is directed towards signal processing by complex interaction networks (Box
1). At the level of single proteins, diverse processes such as signal-mediated acti-
vation of proteins by conformational change, or the parallel-distributed nature of
co-evolutionary mechanisms (Box 2), can also be approached from this perspec-
tive.
Whilst all biological processes are consistent with the physical and chemical
laws of our universe, and in this sense can ultimately be 'reduced' to chemistry
and physics, there is a growing awareness that biological phenomena require an
approach that equally addresses the problem of emergence. How do living systems
emerge from the laws of physics and chemistry? In complex systems, emergent
phenomena result from the rule-governed, non-linear interactions of a large num-
ber of components occurring in a highly context-dependent manner (Box 3). To
come back to the example of consciousness, it arises out of the densely connected
interactions of billions of neurones (and their constituent molecules), and is not a
property of any one brain region, let alone of the neurones themselves. Conscious-
ness is an emergent property of the brain as a whole. Beyond this special case, in-
formation processing can be seen as an emergent property of complex biological
systems in general. One enormous task before us then is the identification of the
rules of the interactions that are embedded in the organisation of living matter.
[Figure: levels of description linked by integration - genomics; genome activation patterns (transcriptomics); protein populations (proteomics); pathway/network analysis; organisation of tissue, cell, complex and structure studied by imaging, EM, X-ray, NMR and structural genomics; together with clinical science, physiology, systems modelling and data mining]

Fig. 3. Analysing and modelling complex cellular systems. The global state of complex cell
systems can be investigated at different levels of description, e.g. genome, proteome, cellular
structure, pathway or network. Novel bioinformatics techniques, notably for systems mod-
elling and data mining, are required to integrate the data obtained into models of complex
cellular systems

Box 1.

Reflections on signalling networks


Signalling cascades constitute sequences of biochemical reactions that di-
rectly link signals initiated by receptors at the plasma membrane to spe-
cific cytoskeletal and transcriptional targets for the regulation of cellular
functional states. These pathways, however, do not function as autono-
mous signal transduction channels, but are interlinked in a complex
branching network. In the nucleus, cytoplasmic signaling cascades are in-
tegrated with other types of signals, such as tissue-specific transcription
factors and local chromosomal structure, by complex cross-talk mecha-
nisms. The multiplicity of interactions suggests a high degree of interde-
pendence between signalling processes for the achievement of a coordi-
nated cellular response. For example, regulation of the cell cycle appears
to be achieved by the parallel processing of the information transmitted
by all interdependent pathways, and the global state of the network de-
termines the outcome. The pattern of connectivity presumably was re-
peatedly reconfigured during evolution to achieve increasingly complex
levels of regulation. A finite set of regulatory components participate in
every signalling event in the cell, and the signal processing ability resides
not in any one isolated cascade, but in the total pattern of signalling cre-
ated by the global state of all network components.
Both the information itself and the key to its interpretation are embed-
ded in the patterns and strength of interactions. Additional factors may
modulate the intensity and duration of the signals. For instance, the cell-
specific patterns in which numerous components of the network are ex-
pressed can greatly alter the response to upstream and cross-talk signals.
Some signalling events appear to act like binary switches, but in general,
cellular signalling does not proceed in an all-or-none fashion. A single
input usually has little effect on the state of the cell; a combination of sig-
nals is normally required. Cells receive a constant stream of signals both
from within the cell itself and from the surrounding environment, and in-
formation is most likely carried by the changing pattern and intensities of
aH signalling events combined. This makes it reasonable to model cellular
signalling in terms of graded responses. On the other hand, oncogene
products and their targets might constitute sensitivt' nodes within the in-
tegrated network, capable of profoundly altering the global information
content of the signalling network.
The fact that functional regulation depends at any given time on mes-
sages built up of combinations of signal molecules rather than on isolated
factors raises many questions as to the mechanisms by which cells distin-
guish signals from background noise, and by which they recognize and
interpret a pattern of related signals within the total field of information.
Thus, intracellular signalling could be seen as a system which processes
the information transmitted by alI pathways in a parallel distributed net-
work.
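The distinction drawn above between binary switches and graded responses can be sketched with a Hill function, in which the coefficient n controls how switch-like the response to a signal is. This is an illustrative aside, not part of the original text; the function is the generic textbook form and the parameter values are invented:

```python
def hill(s, k=1.0, n=1):
    """Fractional activation of a target at signal level s.

    k is the half-maximal signal level; n is the Hill coefficient.
    """
    return s**n / (k**n + s**n)

# n = 1 gives a graded, hyperbolic response; a large n approximates
# an all-or-none switch around the threshold k.
graded = [round(hill(s, n=1), 3) for s in (0.1, 1.0, 10.0)]
switch = [round(hill(s, n=8), 3) for s in (0.1, 1.0, 10.0)]
print(graded)  # rises smoothly with the signal
print(switch)  # near 0 below k, near 1 above k
```

A combination-of-signals requirement, as described in the box, could then be caricatured as a product of such terms, so that no single input alone drives the response to saturation.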
138 S. B. Nagl et al.

Box 2.

Parallel-distributed information processing in coevolutionary networks
Earlier work on the ligand-binding domain of steroid receptors showed
that identification of correlated amino acid substitutions in protein domain
families can be used to investigate the evolution of new functions
[38, 55]. Steroid, thyroid and retinoid hormones constitute the broadest
class of gene-regulatory signal molecules known. Their receptors belong
to the diverse superfamily of nuclear receptors that function as ligand-
inducible transcription factors. During the evolution of the nuclear receptor
superfamily, the ligand-binding domain has evolved to allow binding
of ligands possessing strikingly diverse chemical structures. The ligand is
completely buried within the domain interior and contributes to the hydrophobic
core of the active conformation of the receptor. Therefore,
structurally diverse ligands and the ligand-binding residues combined
need to be able to maintain structural stability and domain dynamics
(conformational changes). How is this potential conflict between structural
constraints and functional diversity in terms of ligand binding resolved
within the domain fold? Is evolutionary change locally confined to
the ligand-binding pocket, or does it also involve distant coevolving positions?
A network of coevolving positions that are distributed throughout the
ligand-binding domain was identified by mutual information analysis
(Fig. 4). 72% of coevolving pairs involve positions in the ligand-binding
pocket. 36% of pairs make direct ligand contacts, and a further 36% are
adjacent to ligand contacting positions. This suggests that correlated resi-
due positions are closely associated with the evolution of ligand binding.
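The mutual information analysis referred to above scores pairs of alignment columns by how well the residue at one position predicts the residue at the other. The sketch below runs the standard calculation on an invented four-sequence toy alignment; the chapter's actual analysis was of course performed on a nuclear receptor domain family alignment:

```python
from collections import Counter
from math import log2

def mutual_information(col_i, col_j):
    """MI (in bits) between two alignment columns, one residue per sequence."""
    n = len(col_i)
    p_i, p_j = Counter(col_i), Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    return sum((c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
               for (a, b), c in p_ij.items())

# Toy alignment: positions 0 and 1 covary perfectly; position 2 is independent.
alignment = ["LKA", "LKG", "VRA", "VRG"]
cols = list(zip(*alignment))
print(mutual_information(cols[0], cols[1]))  # 1.0 bit: perfectly correlated
print(mutual_information(cols[0], cols[2]))  # 0.0 bits: independent
```

High-MI column pairs are the candidate coevolving positions that, in the analysis described here, turn out to cluster in and around the ligand-binding pocket.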
Constraint satisfaction within coevolutionary networks can be under-
stood as a form of parallel-distributed information processing and can be
modelled by artificial neural networks. The evolution of new functional
sites within the context of a coevolutionary network can be modelled by a
classical fully-connected feedforward neural network (Fig. 5). The inbuilt
directionality of this type of neural net corresponds to selection pressure
on the domain for evolving new functions.
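The hetero-associative mapping at the heart of this model (see also Fig. 5) can be caricatured in a few lines. The sketch below substitutes a one-layer Hebbian hetero-associative memory for the trained feedforward network of the actual model, and the bipolar "residue codes" are invented; it is meant only to show what mapping functional-site states to coevolving-position states means computationally:

```python
import numpy as np

# Bipolar (+1/-1) codes standing in for bitstring-encoded residues.
# Each row pairs a functional-site state with the coevolving-position
# state observed with it in a (hypothetical) domain family.
site_states = np.array([[1, -1, 1], [-1, 1, 1], [1, 1, -1]])
coevolved   = np.array([[1, 1], [-1, 1], [1, -1]])

# Hebbian hetero-association: one outer product per training pair.
W = sum(np.outer(y, x) for x, y in zip(site_states, coevolved))

def recall(x):
    """Map a functional-site state to its associated coevolving state."""
    return np.sign(W @ x)

print(recall(site_states[0]))  # recovers coevolved[0]
```

In the model described in the text, a trained multilayer network plays the role of W, so that after training it encodes the state transition rules of the coevolutionary network rather than a fixed table of pairs.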

Box 3.

Characteristics of complex systems


Although a formal consensus on the characteristics of complex systems
has yet to emerge, the following characteristics have found general
agreement [56]:
• Complex systems consist of a large number of elements.
• Each element responds only to information that is available to it
locally.
• The elements of a complex system interact in a dynamic fashion and
these interactions change over time.
• The interactions between elements are richly connected - any one
element influences, and is influenced by, a large number of others.
• The interactions between elements are non-linear.
• The interactions between elements are relatively short range.
• There are loops in the interactions.
• Complex systems are usually open systems.
• Complex systems operate under conditions far from equilibrium.
• Complexity emerges as a consequence of the patterns of interaction
between the elements.
• Complex systems have a history.
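Several of the listed characteristics, locally available information, non-linear short-range interactions, and global pattern emerging from them, are captured by even the simplest cellular automata. The following minimal illustration uses elementary rule 110, a standard example chosen by us and not discussed in the text itself:

```python
# Rule 110: each cell updates from only its immediate neighbourhood,
# yet complex global structure emerges from these local, non-linear rules.
RULE = 110

def step(cells):
    """One synchronous update of a ring of binary cells."""
    n = len(cells)
    return [
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 31 + [1] + [0] * 31
for _ in range(16):
    print("".join(".#"[c] for c in row))
    row = step(row)
```

No cell "knows" the global pattern; the structure in the printed history is a consequence only of the local interaction rule, which is the sense of emergence intended in the list above.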

4.2 Complexity and Post-genomic Biomedicine

Advances in systems biology have enabled us to envisage the possibility of a truly
'personalised' medicine. New genomic diagnostic techniques are being developed
to provide detailed insight into a patient's molecular make-up, and the functional
states of cells in diseased tissues, by employing genetic data combined with genome
expression profiles at the transcript and protein level. These genetic and genomic
data will further need to be integrated with data from molecular medicine,
physiology, and clinical investigations, to enable development and delivery of
therapeutic interventions that are closely matched to an individual's characteristics
(Fig. 3). The hoped-for gains would be increased efficacy of therapies and a reduction
in side effects. There is also an expectation that increased knowledge of cellular
systems will lead to the discovery of a great number of new drug targets. In a
parallel development, biomolecular engineering is fast acquiring the technical
know-how for the design and large-scale manufacture of proteins with novel
therapeutic properties, and these 'designer molecules' will increasingly make up
an important part of the new molecular materia medica.
However, to reap the greatest benefit from these technological advances, they
need to be accompanied by a major conceptual shift. It is crucial to proceed from
an awareness that biological systems are not merely complicated, but that they are
complex. Complex systems properties are likely to give rise to the emergence of
many cellular functions, and therefore are extremely relevant to the study, diagno-
sis and treatment of disease. Consequently, interpretation of the data obtained by
genomic technologies should be sought on this premise. It follows that new com-
putational techniques, able to analyse, model and represent different aspects of
cellular complexity, ought to be a high priority for this new phase of medicine
(Fig. 3).
There has been great excitement in the air: horizons appear to be opening up for
a new way of thinking about biology and medicine, whilst we are also deeply
aware of the huge challenges, both of an empirical and theoretical nature, in bringing
this about [38]. A theoretical framework and methodology for the investigation
of complexity and emergence in biology are still largely undeveloped. Whilst
the study of complex systems has undergone vigorous expansion over the last decade,
with contributions from a wide range of disciplines, biomedicine has so far
remained almost untouched by these developments. A 'systems vision of life' will
need to apply and extend this knowledge within a biological context.

4.3 New Models for Biomedicine:
Ethical Implications of Model Choice

The vast amount of data generated by post-genomic biomedicine creates an urgent
demand for new approaches able to integrate data from diverse sources into meaningful
models of biological systems. With the patient's benefit as the ultimate aim
in mind, an awareness of the ethical dimension inherent in choosing and developing
models in biomedical science is crucial. Models are created within specific
conceptual frameworks that are not ethically neutral, because they function to reinforce
or produce specific ways of thinking and acting [39]. One may argue that
increasing realisation of the complex systems properties of biological entities
gives rise to a duty to make complexity one focal point of inquiry in biomedical
research. This duty can be derived from the ethical prescriptions of the Hippocratic
oath [40, 41]. Such a duty is grounded in the reasonable expectation that the
study of the complex systems properties of biological systems will enable the gain
of new knowledge, and the development of new therapies that are inaccessible
from within other conceptual frameworks. This duty follows from the ethical principle
of beneficence, i.e., benefiting (future) patients. Whenever there are consequences
to human welfare, the duty to treat complex systems as complex systems
is also grounded in the principle of nonmaleficence, or not inflicting harm. A drug
trial, for example, that employs models that are inadequate for detecting effects
due to complex systems properties, may pose great risks to people. A duty to
avoid the use of such models, and to develop alternatives that can model complex
systems behaviour, can be easily appreciated.

Fig. 4. The coevolutionary network in nuclear receptor ligand-binding domains. A distributed
network of coevolving positions can be identified by mutual information analysis of a
multiple domain sequence alignment. For illustration, the network is shown mapped onto
the retinoic acid receptor X-ray structure (stereo view, Protein Data Bank code 2lbd.pdb).
Black, ligand-contacting positions (within 4.5 Å) (α-carbons, spacefill mode); dark grey,
positions adjacent to ligand contacts; light grey, covarying positions; white, other covarying
positions (not linked to ligand pocket). The ligand is shown in black (stick mode)

An alternative to a complex systems framework is the conceptualisation of the
human body as a hugely complicated molecular machine. The full impact these alternative
representations might have on our understanding of who we are, what it
means to be human, what constitutes a person, is still unknown. Genomic science,
as medical science has always been, is 'as much philosophical as practical, a matter
of meaning as much as medical intervention' [42]. Furthermore, any representation
one chooses has ethical implications. The machine vision of human biology,
with its concomitant engineering approach to the treatment of disease, predominates
at present, and increasingly determines how we choose to intervene in the
functioning of the body. However, this is just one possible path to take, and a
complex systems framework most likely would result in different modes of intervention.
Therefore, any choice, concerning the concepts and models we adopt in
this rush toward a 'new biomedicine', ought to be made in the awareness that
models are never only descriptive tools for knowledge representation, but are also
prescriptive. Keller [43] observed:
Since representations are necessarily structured by language (hence, by cul-
ture), no representation can ever "correspond" to reality. At the same time, some
representations are clearly better (more effective) than others. In the absence of a
copy of truth, we need to search for the meaning of 'better' in a comparison of the
uses to which different representations can be put, that is, in the practices they fa-
cilitate. From such a perspective, scientific knowledge is value-laden (and ines-
capably so) just because it is shaped by our choices - first, of what to seek repre-
sentations of, and second, of what to seek representations for. Far from being
value-free, good science is science that effectively facilitates the material realiza-
tion of particular goals, that does in fact enable us to change the world in particular
ways.

Model choice has an explicit ethical dimension that scientists have a responsibility
to confront. A critical awareness of which kind of knowledge, and which
kind of goals, are likely to be facilitated by the chosen mode of representation
ought to inform any modelling project.

4.4 Models as Metaphoric Constructions

Models are in a certain sense metaphorical constructions [44]. As such, they carry
with them not only explicit messages but also implicit content. A model is a de-
vice for seeing the world in a particular way. A well-developed scientific model
accumulates a complicated assortment of techniques, interpretations, standards of
proof, and so on; and may well have a cognitive impact far transcending the origi-
nal context in which it was conceived. Much of this remains unwritten, but is un-
derstood by everyone who has been socialised within the research tradition associ-
ated with the model.
Metaphors occupy a central place in scientific discourse. It has become increas-
ingly recognised that metaphors, models, and language in general, play a central
role as the means by which 'raw data' are shaped into scientific concepts. This re-
alisation exposes the multiple layers of mediation between 'nature' and the 'sci-
ence of nature': quantitative and qualitative perceptions, descriptions of those per-
ceptions, choices shaped by beliefs and the need to create contextual meanings,
interpretations, and the representations chosen in all those mediation steps [45]. A
substantial body of scholarship (for example [46-48]) has described the ways in
which scientists employ metaphors as means to relate the empirical world to ab-
stract theories. The powerful role of metaphors becomes even clearer when one
realises that they not only perform this explanatory function, but also serve as the
'basic organizing relation of a paradigm' [39]. In other words, metaphors are the
devices which organize the 'entire constellation of beliefs, values, techniques, and
so on shared by the members of a given [scientific] community' [49].
Furthermore, metaphors perform a vital role in creative thought; note, for ex-
ample, the rhetorical question posed by Gould [50] whether 'any brilliant insight
has ever been won by pure deduction, and not by metaphor or analogy'. Another
essential role of metaphors in scientific innovation, especially in biology, should
not be forgotten either. At the present state of scientific communication, meta-
phors are often the only language tools available if one intends to convey patterns
of relationship as the consequential parameters, rather than reductionist, hierarchi-
cal descriptions of cause and effect [51]. Biology, which aims to explain phenom-
ena of immense complexity, relies strongly on metaphoric explanations. In fact,
when faced with hugely complex phenomena, a metaphorical explanation with a
high degree of fit can be the most 'realistic' description attainable [52]. Judging
from these observations, models/metaphors can be said to convey facts, help to
concretise abstract theories, structure paradigms, as well as being essential devices
for scientific creativity.
Metaphors also convey cultural messages as well as facilitating communication
in science. In fundamental ways, any kind of representation is linguistically con-
structed; we can only know something about the world and about ourselves
through language. For example, on the level of primary data representation, design
decisions about genome databases determine what uses can be made of the data -
what can be compared with what. Further interpretation of the data and the dis-
semination of this information also depend crucially on the medium of language.
On a societal scale, 'science' and 'culture' continuously create and re-create each
other through language. This traffic of ideas, images, metaphors and theories
about 'nature' and 'human nature' is bidirectional. It is at this junction point that
wider conceptual frameworks, often only implicit in the linguistic representations,
exert their constraints by enabling certain representations but not others. In this
way, metaphors as 'culturally inherited and linguistically reinforced concepts' [53]
play a tremendously important role in the ongoing transformation of our views of
reality and ourselves.

INPUT LAYER / HIDDEN LAYER / OUTPUT LAYER
FUNCTIONAL SITE / COEVOLVING POSITIONS
Selection for new functional site

Fig. 5. A neural network model of the evolution of new functional sites in protein domain
families. The network architecture is that of a classic feedforward network whose size can
vary depending on the coevolutionary network to be modelled (closed arrows). Sequence
positions (agents) function as fully connected processing elements (squares). Each agent is
represented as a binary vector (open arrows). Amino acids are encoded as bitstrings. A hetero-associative
mapping is performed that maps the input vector matrix (agent states in the
functional site) to the output vector matrix that ranges over a different vector space (states
of coevolving agents). After training, the neural network model encodes the state transition
rules of the coevolutionary network

It is obvious that there are choices to be made. To follow Keller [43], what are
we seeking representations of, and what are we seeking representations for? In a
very fundamental way, the choice of the models we work with, and the choice of
language and metaphor in which we express these ideas, constitute ethical choices
about our attitudes and actions toward other persons and the world. But this will
be an on-going reflective process. For what is being proposed here is not simply a
replacement of one set of models/metaphors with another, but recognition that sci-
entists are continuing to construct a discourse with wide ranging implications.
Such awareness demands a stance of self-reflectivity in the inquirer herself or
himself. This self-reflectivity aims to make transparent the values and motivations
that inform the research matter, process and goals. As we take up the challenges of
creating new concepts and models for post-genomic biomedicine, we would do
well to probe their implicit metaphoric content. As a community, we might con-
sider it worthwhile to critically explore the possible consequences of adopting cer-
tain conceptual frameworks over others from the widest perspective possible, en-
compassing scientific, medical, social, and cultural dimensions.
Post-genomic biomedicine promises to bring about revolutionary changes in di-
agnosis and intervention, accompanied by a profound transformation in our under-
standing of human bodies, health and disease. There are urgent demands for new
modelling methods and new computational tools to speed up the exploitation of
the huge amount of data becoming available. However, as the directions we em-
bark on now most likely will have significant consequences far into the future, one
may wish to resist these pressures to some extent to allow space for reflection be-
yond the immediate pragmatic challenges. Post-genomic biomedicine may benefit
from a new approach to modelling that includes an ongoing exploration of the
many levels of meanings and prescriptions for action that are inherent in the mod-
els we create.

5 Conclusions

Any scientific or other academic discipline can be viewed from either a historical
or more abstract perspective. In the case of molecular and cell biology the histori-
cal roots are in medicine and the study of human disease, classical biology and
biologic al chemistry. Some of these origins are reflected in the current ways in
which the science is practised today. For example classical biology or natural his-
tory was descriptive, partly anecdotal and led to a preoccupation with the idea of
classification that still dominates aspects of biochemistry (Sec. 2.3). There was an
idea in the relatively early days of molecular biology that, in some sense, the task
of the molecular biologist was to reduce complex biological phenomena to phys-
ico-chemical descriptions. However (Sec. 4.2) currently the complexity of bio-
logical systems is seen differently.
Of several approaches to studying the complexity of natural systems, the semiotics
of Chandler [54] highlights the new properties that emerge in the progression
from the classes of Subatomic particles, through Atoms, Molecules, Biomacromolecules,
Cells and Ecoment, to Environment. (The words in italics are Chandler's
own names for these classes.) A modern view of biology is that properties including
reproduction, adaptation, learning and information processing emerge with increasing
complexity. It is beyond dispute that the Environment consists of Subatomic
particles, but there is no feasible method whereby a detailed study of such
particles could predict the human behaviour that has led to global warming, or the
nature of the remarkable population of types of fish in Lake Victoria ....
Computing and biology have a relationship which biologists might regard as an
example of mutualism. Molecular biology would be impossible without computa-
tional analysis of data (Sec. 2) and ideas derived from theoretical considerations
provide valuable metaphors for both molecules (Sec. 3) and complex cellular sys-
tems (Sec. 4). An integrative description is at present difficult for reasons stated at
the end of Sec. 3.2, although it is arguable that those practising bioinformatics
might make a major contribution by addressing this point.
Our final point is to comment on the biology-to-computing component of the
mutualistic relationship. We argue that the emergence of properties represented by
the system of Chandler [54] or, alternatively, as the process of evolution should
provide novel metaphors in computing. The criterion here is simple but strict: if
the emergence of biological complexity provides useful algorithms, the develop-
ment of such algorithms can become independent of their inspiration. A geneticist
faced with a description of a genetic algorithm (or genetic program) might com-
ment critically on the fact that many facets of actual genetic change are either
omitted or misunderstood. That is irrelevant if the algorithm is efficient and use-
ful. Rigorous and robust methods inspired by the development of biological com-
plexity can justify themselves and, even if they deviate from the facts of their in-
spiration, should facilitate the analysis of the emergence of properties in more
complex systems.

References

1. Parish, J. H. (1999) The language of proteins. In Visual Representations and Interpretations
(eds. Paton, R. & Neilson, I.), Springer, Berlin Heidelberg New York, pp. 139-145.
2. Dyson, H. J., Merutka, G., Waltho, J. P., Lerner, R. A. & Wright, P. E. (1992a). Folding
of peptide fragments comprising the complete sequence of proteins. Models for
initiation of protein folding. I. Myohemerythrin. Journal of Molecular Biology 226,
795-817.
3. Dyson, H. J., Sayre, J. R., Merutka, G., Shin, H. C., Lerner, R. A. & Wright, P. E.
(1992b). Folding of peptide fragments comprising the complete sequence of proteins.
Models for initiation of protein folding. II. Plastocyanin. Journal of Molecular Biology
226, 819-835.
4. Varley, P., Gronenborn, A. M., Christensen, H., Wingfield, P. T., Pain, R. H. & Clore,
G. M. (1993). Kinetics of folding of the all-beta sheet protein interleukin-1 beta. Science
260, 1110-1113.
5. Wright, P. E., Dyson, H. J. & Lerner, R. A. (1988). Conformation of peptide fragments
of proteins in aqueous solution: implications for initiation of protein folding. Biochemistry
27, 7167-7175.
6. Maturana, H. R. and Varela, F. J. (1992). The Tree of Knowledge, Shambhala Publications.
7. Barker, W. C., Garavelli, J. S., Huang, H. Z., McGarvey, P. B., Orcutt, B. C., Srinivasarao,
G. Y., Xiao, C. L., Yeh, L. S. L., Ledley, R. S., Janda, J. F., Pfeiffer, F., Mewes,
H. W., Tsugita, A. & Wu, C. (2000). The Protein Information Resource (PIR). Nucleic
Acids Research 28, 41-44.
8. Namboodiri, K., Pattabiraman, N., Lowrey, A., Gaber, B., George, D. G. & Barker, W.
C. (1990). NRL-3D - a sequence-structure database. Biophysical Journal 57, A406.

9. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov,
I. N. & Bourne, P. E. (2000). The Protein Data Bank. Nucleic Acids Research
28, 235-242.
10. Bairoch, A. & Apweiler, R. (2000). The SWISS-PROT protein sequence database and
its supplement TrEMBL in 2000. Nucleic Acids Research 28, 45-48.
11. Lo Conte, L., Ailey, B., Hubbard, T. J. P., Brenner, S. E., Murzin, A. G. & Chothia, C.
(2000). SCOP: a Structural Classification of Proteins database. Nucleic Acids Research
28, 257-259.
12. Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B. & Thornton, J.
M. (1997). CATH - a hierarchic classification of protein domain structures. Structure
5, 1093-1108.
13. Holm, L. & Sander, C. (1996). The FSSP database: Fold classification based on structure-structure
alignment of proteins. Nucleic Acids Research 24, 206-209.
14. Hofmann, K., Bucher, P., Falquet, L. & Bairoch, A. (1999). The PROSITE database,
its status in 1999. Nucleic Acids Research 27, 215-219.
15. Attwood, T. K., Flower, D. R., Lewis, A. P., Mabey, J. E., Morgan, S. R., Scordis, P.,
Selley, I. N. & Wright, W. (1999). PRINTS prepares for the new millennium. Nucleic
Acids Research 27, 220-225.
16. Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Howe, K. L. & Sonnhammer, E. L.
L. (2000). The Pfam protein families database. Nucleic Acids Research 28, 263-266.
17. Michie, A. D., Orengo, C. A. & Thornton, J. M. (1996). Analysis of domain structural
class using an automated class assignment protocol. Journal of Molecular Biology
262, 168-185.
18. Russell, R. B., Copley, R. R. & Barton, G. J. (1996). Protein fold recognition by mapping
predicted secondary structures. Journal of Molecular Biology 259, 349-365.
19. Warner, G. J., Ison, J. C. & Parish, J. H. (1998). "Protein fold recognition from secondary
structure signatures", CCP11 Newsletter Issue 6 2.4. Available as
www.hgmp.mrc.ac.uk/CCP11/newsletter/vol2_4/ccp11_article/full_article.html
20. Zhang, C. T. & Zhang, R. (1998). A new criterion to classify globular proteins based
on their secondary structure contents. Bioinformatics 14, 857-865.
21. Zhang, C. T. & Zhang, R. (1999). A quadratic discriminant analysis of protein structure
classification based on the helix/strand content. Journal of Theoretical Biology
201, 189-199.
22. Daniel, S. C., Parish, J. H., Ison, J. C., Blades, M. J. & Findlay, J. B. C. (1999) Alignment
of a sparse protein signature with protein sequences: application to fold prediction for
three small globulins. FEBS Letters 459, 349-352.
23. Changeux, J. (1965) The control of biochemical reactions. Scientific American 214,
36-45.
24. Paton, R. C., Staniford, G. and Kendall, G. (1996) Specifying logical agents in cellular
hierarchies. In Computation in cellular and molecular biological systems (eds.
Cuthbertson, R., Holcombe, M. & Paton, R. C.), World Scientific: Singapore, pp. 105-119.
25. Albrecht-Buehler, G. (1990), In Defense of 'Nonmolecular' Cell Biology, International
Review of Cytology, 120, 191-241.
26. Paton, R. C. (1997), Glue, Verb and Text Metaphors in Biology, Acta Biotheoretica,
45, 1-15.
27. Paton, R. C. & Matsuno, K. (1998), Some common themes for enzymes and verbs,
Acta Biotheoretica, 46, 131-140.

28. Fillmore, C. J. (1968), The Case for Case, In Universals in Linguistic Theory (eds.
Bach, E. & Harms, R. T.), Holt, Rinehart & Winston, pp. 1-88.
29. Welch, G. R. & Kell, D. B. (1986). Not just Catalysts - Molecular Machines in Bioenergetics.
In The Fluctuating Enzyme (ed. Welch, G. R.), John Wiley, pp. 451-492.
30. Conrad, M. (1992), The seed germination model of enzyme catalysis, BioSystems 27,
223-233.
31. Uyeda, K. (1992), "Interactions of Glycolytic Enzymes with Cell Membrane",
Current Topics in Cell Regulation 33, 31-46.
32. Pilkis, S. J., Claus, T. H., Kurland, I. J. & Lange, A. J. (1995), "6-Phosphofructo-2-kinase/
Fructose-2,6-Bisphosphatase: A metabolic signalling enzyme", Annu. Rev. Biochem.,
64, 799-835.
33. Giles, R. H., Peters, D. J. M. & Breuning, M. H. (1998), Conjunction dysfunction:
CBP/p300 in human disease, Trends in Genetics, 14, 178-183.
34. Blohm, D. H. & Guiseppi-Elie, A. (2001). New developments in microarray technology.
Current Opinion in Biotechnology 12, 41-47.
35. Williams, K. L. (1999). Genomes and proteomes: Towards a multidimensional view of
biology. Electrophoresis 20, 678-688.
36. Pandey, A. & Mann, M. (2000). Proteomics to study genes and genomes. Nature 405,
837-846.
37. Rain, J.-C., Selig, L., De Reuse, H., Battaglia, V., Reverdy, C., Simon, S., Lenzen, G.,
Petel, F., Wojcik, J., Schachter, V., Chemama, Y., Labigne, A. & Legrain, P. (2001).
The protein-protein interaction map of Helicobacter pylori. Nature 409, 211-215.
38. Nagl, S. B. (2000c). Protein evolution as a parallel-distributed process: A novel approach
to evolutionary modelling and protein design. Complex Systems 12:261-280.
39. Nagl, S. B. (2000a). Science and moral agency in a complex world. Paper presented at
the 5th World Congress of Bioethics, Imperial College, London, September 2000.
40. Nagl, S. B. (2000b). Neural network models of protein domain evolution. HYLE - An
International Journal for the Philosophy of Chemistry 6:143-159. An electronic version
of the paper is available at
41. http://www.uni-karlsruhe.de/~ed01/Hyle/Hyle6/nagl.htm.
42. Beauchamp, T. L. and Childress, J. F. (1989). Principles of Biomedical Ethics, Oxford
University Press, Oxford, p. 120.
43. Kemp, M. and Wallace, M. (2000). Spectacular Bodies: The Art and Science of the
Human Body from Leonardo to Now. Hayward Gallery Publishing, p. 24.
44. Keller, E. F. (1992). Secrets of life, secrets of death, Routledge, p. 5.
45. Holland, J. H. (1998). Emergence, Addison-Wesley, New York, p. 207.
46. Spanier, B. B. (1995) Im/Partial Science, Indiana University Press.
47. Leatherdale, W. H. (1974) The Role of Analogy, Model and Metaphor in Science, North
Holland.
48. MacCormac, E. R. (1976). Metaphor and Myth in Science and Religion, Duke University
Press.
49. Hesse, M. (1996). Models and Analogies in Science, University of Notre Dame Press.
50. Kuhn, T. S. (1970) The Structure of Scientific Revolutions, 2nd edition, University of
Chicago Press, Chicago, Ill., p. 175.
51. Gould, S. J. (1996) Why Darwin? The New York Review of Books 4 April, 1996: 10-14,
p. 10.

52. Thaler, D. S. (1996). Paradox as path: pattern as map. In The Philosophy and History
of Molecular Biology: New Perspectives (ed. Sahotra Sarkar), Kluwer, Dordrecht, pp.
233-248.
53. Depew, D. J. & Weber, B. H. (1996). Darwinism Evolving: Systems Dynamics and the
Genealogy of Natural Selection. First paperback edition, MIT Press, Cambridge,
Mass., pp. 21-30.
54. Margulis, L. and Sagan, D. (1995). What is Life?, Weidenfeld & Nicolson Ltd, p. 41.
55. Chandler, J. L. R. (1996) Complexity III. Emergence. Notation and symbolization.
WESScomm 2, 34-37.
56. Nagl, S. B. (2001). Can correlated mutations in protein domain families be used for
protein design? Briefings in Bioinformatics. In press.
57. Cilliers, P. (1998). Complexity & Postmodernism, Routledge, p. 3.
Models of Genetic Regulatory Networks

H. Bolouri, M. Schilstra

Science and Technology Research Centre, University of Hertfordshire, AL10 9AB, UK

Hamid Bolouri
Institute for Systems Biology, Seattle, WA 98103-8904, USA
Division of Biology 156-29, California Institute of Technology, CA 91125, USA
H.Bolouri@herts.ac.uk

Abstract. This chapter provides a short review of the modelling of Genetic Regulatory
Networks (GRNs). Modelling GRNs requires that (at least) some parts of a
biological system be represented in some kind of logical formalism. GRNs represent
the set of all interactions among genes and their products that determine the temporal
and spatial patterns of expression of a set of genes. The origin of modelling
the regulation of gene expression goes back to the Nobel-prize winning work of
Lwoff, Jacob and Monod on the mechanisms underlying the behaviour of bacterial
viruses that switch between so-called lytic and lysogenic states. Some of the
circuit-based approaches to GRNs, such as the work of Kauffman, Thomas, and
Shapiro and Adam, are discussed.

1 What are Genetic Regulatory Networks?

A Genetic Regulatory Network (GRN) is simply the set of all interactions among
genes and their products determining the temporal and spatial patterns of expres-
sion of a set of genes. Figure 1 is a cartoon of some of the major interactions that
may be included in a GRN. These include: mRNA transcription, transport out of
(and later into) the cell nucleus, protein synthesis, and protein-protein interactions
within the cytoplasm (including interactions with signaling pathways). Since
genes underlie all aspects of evolution, development and physiology, the topic as
a whole is clearly well beyond the scope of this chapter and its authors! Luckily,
there are numerous very good textbooks that cover specific aspects of GRNs.
The biology of gene structure, regulation and function is beautifully described
in a number of textbooks, e.g. [1,2]. The significance, operation and evolution of
genetic regulatory networks are admirably described in recent books by Eric
Davidson [3] and Sean Carroll et al. [4]. For the role of GRNs in development, two
classic textbooks are Scott Gilbert's Developmental Biology [5] and Cells, em-
bryos, and evolution, by John Gerhart and Marc Kirschner [6].

The origins of modelling the regulation of gene expression go back to the
Nobel-prize winning work of Lwoff, Jacob and Monod on the mechanisms under-
lying the behaviour of temperate bacteriophages, bacterial viruses that switch be-
tween so-called lytic and lysogenic states [7-9]. Mark Ptashne's book [10] gives
an excellent review of the lytic-lysogenic behavior of bacteriophage lambda, ar-
guably the most studied GRN. Some more recent models of GRNs and the ac-
companying theoretical developments are described in Chaps. 1 to 5 of [11], and
an extensive review of the current literature in this field may be found in [12].
For the rest of this chapter, we will assume the reader is familiar with the basic
biology of gene regulation and concentrate on specific issues of interest to model-
ers. However, to avoid confusion, in the following subsections we first briefly
outline a few basic concepts and facts.

[Figure: cartoon of a cell showing the nucleus, nuclear membrane, ribosome,
genes, cytoplasm, cell wall, cell-surface receptor populations, and a ligand
molecule]
Fig. 1. Interactions and processes that play a role in GRNs

2 What is a Gene?

Over the years, the definition of what constitutes a gene has gradually changed.
Unfortunately, simple definitions like 'one gene, one protein' tend to confuse
rather than help. A commonly accepted definition of a gene is all the DNA se-
quence that is necessary and sufficient to account for the experimentally observed
pattern of production of a specific protein in vivo. Note that this definition implies
there are two aspects to a gene. Firstly, a gene encodes a particular protein. Sec-
ondly, a gene includes any DNA sequence that affects its pattern of expression,
i.e. what cell types it is expressed in, how much, and at what times. Thus, the

DNA encoding a single gene may be divided into two parts: the protein coding
sequence, and the regulatory sequence.

3 Regulation of Single Genes

The process by which genes produce proteins involves many steps, and regulatory
interactions can influence the rate and product of each step. The regulation of
transcription, the process by which DNA sequence is 'read' and transcribed into
messenger RNA (mRNA), is probably the most prevalent form of genetic regula-
tion. For the rest of this chapter we will focus on transcriptional control of gene
activity. But mRNA splicing, editing, transport and localization, translation, and
protein modifications all contribute to the concentration and variety of any protein
produced by a gene and should not be forgotten. Fig. 2 is a cartoon example of
how multiple proteins can interact to regulate gene transcription. Different combi-
nations of proteins can bind to different regulatory sequences in a gene and acti-
vate or repress transcription depending on the history of a cell and extracellular
signals (see books referred to earlier).

4 Differences in Gene Regulation Between Organisms

The genomes of more than 45 species have now been sequenced and some 90
more are about to be completed¹. Analysis of these genomes confirms some gen-
eral trends. For example, the genomes of most prokaryotes (single celled organ-
isms without a nucleus, such as bacteria) and cell organelles (such as mitochon-
dria and chloroplasts) are organized in a single, circular DNA. This appears to put
significant constraints on the overall size of the genome. The relatively short gen-
eration time of prokaryotes, and their very long evolutionary history, combine to
make their genomes compact and extremely well optimized. Gene expression in
prokaryotes is usually controlled from a relatively short stretch of DNA just be-
fore the start of the coding sequence, the promoter, to which repressor or activator
proteins can bind. A single promoter may control the transcription of more than
one protein, and sometimes the coding sequences of more than one protein over-
lap.
On the other hand, the genes of eukaryotes (organisms in which all of the ge-
nomic DNA is contained inside a nucleus) generally have large regulatory se-
quences. The regulatory sequences can be thousands of nucleotides away from
their coding sequences. Transcription factors, the regulatory proteins that bind to
these sequences, control the rate at which transcription is initiated [1]. In simple
single celled eukaryotes, such as yeast, a single transcription factor may some-

¹ http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/bact.html

times be sufficient to initiate transcription, but, in general, the more complex the
organism is, the greater the average number of transcription factors that are in-
volved in the regulation of a single gene.

[Figure: panels showing a gene's regulatory and coding regions; a single gene
that may be (in)activated by several different protein combinations; activating
transcription factors forming a complex attached to the regulatory DNA; and the
complex enabling transcription as RNAP copies DNA to mRNA]

Fig. 2. Multiple proteins interacting to regulate gene transcription

5 Modeling GRNs

Some genes (such as those encoding heat shock proteins) are present in large
numbers within the genome. But for the most part, there are just one or two active
copies of each gene per cell. So, even when activators, repressors, or transcription
factors are present in large numbers, transcription is fundamentally stochastic. For
some genetic circuits, such as the lambda-phage lysis-lysogeny decision circuit
[13] and circuits involving inversion of DNA segments [14,15], this randomness
appears to be important. It allows cells to exhibit multiple alternative responses to
given conditions. For the most part though, one can average out the randomness
over time. In that case, the rate of change in activity for a gene can be represented
by ordinary differential equations (ODEs). Such time-averaged representations of
gene regulation are simplified by assuming that parts of the system are at steady
state or equilibrium on the time scale of mRNA production [16-18]. This lends it-
self readily to modelling GRNs as continuous sigmoids and autonomous differen-
tial equations [19]. A further simplification uses 'sum of products' polynomial
functions to describe gene activity over time (see [20]). Further simplifications

use kinetic, qualitative, and Boolean logic models (for a review of the underlying
theoretical issues, see [21-25]; for example models, see [26-30]).
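To make the ODE view described above concrete, the following minimal sketch integrates a single gene whose transcription rate is a sigmoidal (Hill) function of an activator concentration, with first-order mRNA decay. All names and parameter values here are our own illustrative choices, not taken from any of the cited models.

```python
# Minimal sketch of the ODE view of gene regulation: one gene, whose
# transcription rate is a sigmoidal (Hill) function of an activator
# concentration, minus first-order decay of its mRNA. All names and
# parameter values are illustrative, not from any published model.

def hill_activation(tf, k, n):
    """Fraction of maximal transcription, given activator level tf."""
    return tf ** n / (k ** n + tf ** n)

def simulate(tf_level=2.0, k=1.0, n=4, v_max=1.0, decay=0.1,
             dt=0.01, steps=1000):
    m = 0.0  # mRNA concentration, arbitrary units
    for _ in range(steps):
        # dm/dt = activator-driven production - first-order decay
        dm_dt = v_max * hill_activation(tf_level, k, n) - decay * m
        m += dm_dt * dt  # forward Euler step
    return m

# mRNA rises towards the steady state (v_max / decay) * hill_activation(...)
print(round(simulate(), 2))
```

The steep, switch-like shape of the Hill function for larger n is what makes the continuous-sigmoid view a natural bridge to the discrete (Boolean) approximations discussed next.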
An example of a logical model of a GRN is shown in Figs. 3 and 4. These
figures are taken from an ongoing collaboration between our group at the Univer-
sity of Hertfordshire and the laboratory of Prof. Eric Davidson at the California
Institute of Technology. The diagrams show a set of interactions among families
of genes inferred from cellular experiments. The model is intended to represent
the necessary and sufficient interactions for endo-mesoderm specification in sea
urchins. A fuller description of the model is beyond the scope of this chapter.
Here, we present the diagrams as examples with which we can give meaning to
some basic concepts. In both figures, gene families are represented by horizontal
lines crossed with a 90° arrow. The coloured lines above the genes represent inter-
actions among gene products (proteins). Where such a line is incident to a gene
symbol, it indicates a regulatory interaction between the protein and the gene. The
manner in which regulatory proteins interact with each other and the regulatory
DNA of the gene they are incident on, is represented below the gene symbols. Ar-
rows indicate positive (excitatory) inputs to the system while bars indicate repres-
sive inputs. The regulatory interactions (in this case, logical) are represented by
the functions indicated in the circles. Fig. 3 shows the 'view from the genome'
[31], i.e. the set of all interactions possible among the modelled components. In
any one cell, and at any one time, only a subset of these interactions will actually
be taking place. The active interactions are represented by the coloured lines in
Fig. 4, whereas the grey lines represent inactive interactions. Fig. 4 is an example
of what Arnone and Davidson [31] call the 'view from the nucleus'.

6 Some GRN Models to Date

The history of GRN modelling goes back a long way. For example, in 1971, Stu-
art Kauffman started describing GRNs as Boolean networks [32], and in 1973
Rene Thomas delineated the principles of kinetic logic. Here, we outline only
some of the major trends in modelling over the past 20 years. This is not, by any
means, a comprehensive review!
Prokaryotic systems, and in particular the lysis-lysogeny switch of bacterio-
phage lambda have been studied extensively, and are probably the best place to
start for those new to the field; see for example [16, 17, 26, 27]. McAdams and
Arkin [33], as well as Gibson & Bruck [34] have shown that stochastic effects
play a fundamental role in the regulation of lambda phage activity (see [35] for an
earlier example of analysis of stochastic effects on gene activity). For a model of
another bacteriophage GRN, see [36]. We previously mentioned the important
work of Thomas and colleagues (e.g. [23]) in formulating a discrete logic formal-
ism ('kinetic logic') for GRNs. This formalism has been used to model a number
of genetic regulatory systems, both prokaryotic and eukaryotic (see [37] and ref-
erences therein).
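The Boolean-network formalism introduced by Kauffman can be illustrated with a toy example. In the sketch below, the three genes and their update rules are invented purely for illustration; synchronous updating drives any starting state into a repeating cycle of states, i.e. an attractor.

```python
# Toy Kauffman-style Boolean network: three genes, updated synchronously.
# Each gene's next state is a Boolean function of the current state vector.
# The wiring and rules here are invented, not taken from any real GRN.

def step(state):
    a, b, c = state
    return (
        not c,       # gene A is repressed by C
        a,           # gene B is activated by A
        a and b,     # gene C requires both A and B
    )

def find_attractor(state):
    """Iterate the network until a state repeats; return the cycle."""
    trajectory = []
    while state not in trajectory:
        trajectory.append(state)
        state = step(state)
    return trajectory[trajectory.index(state):]

cycle = find_attractor((False, False, False))
print(len(cycle))  # -> 5
```

In this particular toy network the trajectory from the all-off state closes back on itself, so all five visited states form a single cycle; in general a network partitions its state space into basins of attraction, which is the property Kauffman exploited.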

[Figure: wiring diagram of the sea urchin endo-mesoderm GRN, showing
endo-mesoderm and endo-specific gene groups, with early and late signals from
the cleavage-stage micromeres as inputs]
Fig. 3. Sea urchin endo-mesoderm GRN: view from the genome

An important contribution to future simulations of large-scale GRNs comes
from Shimada et al. [38], who used the lambda-phage GRN to demonstrate the
feasibility of GRN simulation using multiple levels (resolutions) of abstraction.
In a groundbreaking demonstration that GRNs are similar to mixed (analog-
digital) microelectronic circuits, and can therefore be modelled, simulated and
analysed with similar tools, McAdams and Shapiro [39] used the microelectronic
circuit simulator Spice to model the bacteriophage lambda GRN. Another impor-
tant contribution to the area has been the pioneering use of neural networks that
adapt their structure to model GRNs by Eric Mjolsness and colleagues, see [40].
Note that Mjolsness and colleagues start with fully connected networks that are
pruned as a result of the adaptation (training) process, so their models are capable
of representing any set of interactions between genes. Other applications of neural
networks that use fixed network structures (e.g. [41]) are less computationally
intensive, but do not have the same representational power. Mjolsness and
colleagues initially used neurons with simple weighted-sum-of-inputs activation
functions, but have recently extended these to neurons with higher order activa-
tion functions (see [20]).
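A minimal sketch of such a weighted-sum-of-inputs ('connectionist') gene network is given below. The two-gene wiring, weights and thresholds are our own invention for illustration; in the work cited above such parameters are fitted to expression data rather than chosen by hand.

```python
# Sketch of a connectionist gene network of the kind described above:
# each gene's rate of synthesis is a sigmoid of a weighted sum of all
# gene products, minus first-order decay. Weights and thresholds are
# invented for illustration (a real model would fit them to data).
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate(weights, thresholds, decay=1.0, dt=0.05, steps=2000):
    n = len(weights)
    a = [0.1] * n  # initial expression levels
    for _ in range(steps):
        rates = [
            sigmoid(sum(weights[i][j] * a[j] for j in range(n))
                    + thresholds[i]) - decay * a[i]
            for i in range(n)
        ]
        a = [a[i] + rates[i] * dt for i in range(n)]
    return a

# Two genes: gene 0 activates itself and gene 1; gene 1 represses gene 0.
weights = [[4.0, -8.0],
           [6.0,  0.0]]
levels = simulate(weights, thresholds=[-2.0, -3.0])
print([round(v, 3) for v in levels])
```

Pruning such a network amounts to driving weights to zero, which is how the fully connected starting point can collapse onto a sparse set of inferred regulatory interactions.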

Fig. 4. Sea urchin endo-mesoderm GRN: view from the nucleus (view in one group of cells
at a particular developmental time)

7 GRN Simulators

Unfortunately, there are currently no simulators with facilities for the range of
GRN representations discussed here. We are currently developing such a resource
that we hope to make public shortly (see http://strc.herts.ac.uk/bio/Maria/NetBuilder/
index.htm).
One of the earliest publicly available simulators with explicit representations
for gene regulation was MetaSim, developed by Hans Westerhoff and colleagues
[42]. Under development is Gene-O-Matic, a package aimed specifically at mod-
elling genetic networks in a multi-cellular context (PI Ute Platzer, German Can-
cer Research Center, Heidelberg, http://mbi.dkfz-heidelberg.de/mbi/research/cellsim/
net/index.shtml). A number of knowledge-based simulators that describe GRNs in
terms of predicate logic have also been developed: MolGen [43], Genesis [44],
GenSim/HypGene [45]. Of course, one can use general biochemical network
simulators such as
• DBSolve (PI Igor Goryanin, Glaxo-SmithKline, UK,
http://websites.ntl.com/~igor.goryanin)
• E-Cell (PI Masaru Tomita, Keio Univ., Japan, http://www.e-cell.org/)
• V-Cell (National Resource for Cell Analysis and Modelling,
http://www.nrcam.uchc.edu/index.html)
• Gepasi (PI Pedro Mendes, Virginia Tech, http://www.ncgr.org/software/gepasi/)

• BioQUEST (PI Brian White, University of Maryland,
http://omega.cc.umb.edu/~bwhite/ek.html)
• ProMoT/Diva (PI Martin Ginkel, Max-Planck-Institute Magdeburg,
http://www.mpi-magdeburg.mpg.de/research/projecCa/pro_a4/promot.html)
• Jarnac (PI Herbert Sauro, Caltech, http://www.cds.caltech.edu/~hsauro and
http://members.tripod.co.uk/sauro/biotech.htm)
to develop one's own GRN libraries.

8 Uses of GRNs Beyond Biology

The formulation of the regulatory function of genes as a 'sum of products' poly-
nomial (see also Figs. 3 and 4) highlights the similarity of the regulatory region of
a gene to the computational graphs that represent gene function in Genetic Pro-
gramming [46]. GRNs have been used to evolve mobile robots and other 'agents'
in Artificial Life [47, 48]. A number of researchers, including Alistair Rust and
colleagues in our group have used GRNs to mimic neural development as part of
a methodology to automatically evolve functionally useful artificial neural net-
works [49-51]. The use of GRNs in artificial evolved systems is still in its infancy
and many obstacles associated with the appropriate design of cost functions and
modelling of the structural and behavioural properties of DNA and proteins pre-
sent exciting and challenging research opportunities.
Another novel and promising area of research is the design of de novo GRNs,
either as modifications of existing circuits in host organisms [52], or insertion of
new genes into hosts [53]. Altogether, we can safely say 'the fun is just begin-
ning'!

Acknowledgements. We thank Mark Robinson for his careful and constructive
review of our manuscript. Research in our group is funded by grants from the UK
BBSRC and EPSRC, the Wolfson Foundation, the US NIH, and the Japan Science
and Technology Corp.

References

1 Alberts, B., et al., Molecular biology of the cell. 3rd ed. 1994, New York: Garland Pub-
lishing.
2 Lewin, B., Genes VII. 1999, Oxford: Oxford University Press.
3 Davidson, E.H., Genomic regulatory systems. 2001, San Diego Calif.: Academic Press.
4 Carroll, S.B., J.K. Grenier, and S.D. Weatherbee, From DNA to diversity: molecular
genetics and the evolution of animal design. 2001, Cambridge MA: Blackwell Science
Inc.
5 Gilbert, S.F., Developmental biology. 2000: Sunderland, Mass: Sinauer Associates Inc.

6 Gerhart, J. and M. Kirschner, Cells, embryos, and evolution. 1997, Malden Mass:
Blackwell Science. 642.
7 Lwoff, A., Lysogeny. Bacteriol. Rev., 1953. 17: p. 269-337.
8 Jacob, F. and J.L. Monod, Genetic regulatory mechanisms in the synthesis of proteins.
J. Mol. Biol., 1961. 3: p. 318-356.
9 Jacob, F. and J. Monod, On the regulation of gene activity. Cold Spring Harb. Symp.
Quant. Biol., 1961. 26: p. 193-211, 389-401.
10 Ptashne, M., A genetic switch. Phage Lambda and higher organisms. 2nd ed. 1992, Ox-
ford: Blackwell Scientific.
11 Bower, J.M. and H. Bolouri, eds. Computational modelling of genetic and biochemical
networks. Computational molecular biology, ed. S. Istrail, P. Pevzner, and M. Water-
man. 2001, MIT Press: Cambridge Mass.
12 Smolen, P., D.A. Baxter, and J.H. Byrne, Modelling transcriptional control in gene
networks - Methods, recent results, and future. Bull. Math. Biol., 2000. 62: p. 247-292.
13 Arkin, A., J. Ross, and H.H. McAdams, Stochastic kinetic analysis of developmental
pathway bifurcation in phage lambda-infected Escherichia coli cells. Genetics, 1998.
149: p. 1633-1648.
14 Van De Putte, P. and N. Goosen, DNA inversions in phages and bacteria. Trends
Genet., 1992. 8: p. 457-462.
15 McAdams, H.H. and A. Arkin, It's a noisy business! Genetic regulation at the nanomo-
lar scale. Trends Genet., 1999. 15: p. 65-69.
16 Johnson, A.D., et al., λ repressor and cro - components of an efficient molecular switch.
Nature, 1981. 294: p. 217-223.
17 Ackers, G.K., A.D. Johnson, and M.A. Shea, Quantitative model for gene regulation
by λ phage repressor. Proc. Natl. Acad. Sci. USA, 1982. 79: p. 1129-1133.
18 Keller, A.D., Model genetic circuits encoding autoregulatory transcription factors. J.
Theor. Biol., 1995. 172: p. 169-185.
19 Mestl, T., E. Plahte, and S.W. Omholt, A mathematical framework for describing and
analysing gene regulatory networks. J. Theor. Biol., 1995. 176: p. 291-300.
20 Gibson, M. and E. Mjolsness, Modelling the activity of single genes., in Computational
modelling of genetic and biochemical networks., J.M. Bower and H. Bolouri, Editors.
2001, MIT Press: Cambridge, Mass. p. 1-48.
21 Snoussi, E.H., Qualitative dynamics of piecewise-linear differential equations: a dis-
crete mapping approach. Dynamics and Stability of Systems, 1989. 4(3 & 4): p. 189-
207.
22 Thomas, R. and R. D'Ari, Biological feedback. 1990, Boca Raton, Fla.: CRC Press.
23 Thomas, R., D. Thieffry, and M. Kaufman, Dynamical behaviour of biological regula-
tory networks. I. Biological role of feedback loops and practical use of the concept of
the loop-characteristic state. Bull. Math. Biol., 1995. 57(2): p. 247-276.
24 Kauffman, S.A., The origins of order. Self-organization and selection in evolution.
1993, New York NY: Oxford University Press.
25 Akutsu, T., S. Miyano, and S. Kuhara, Inferring qualitative relations in genetic net-
works and metabolic pathways. Bioinformatics, 2000. 16(8): p. 727-734.
26 Shea, M.A. and G.K. Ackers, The OR control system of bacteriophage lambda. A
physical-chemical model for gene regulation. J. Mol. Biol., 1985. 181: p. 211-230.
27 Thieffry, D. and R. Thomas, Dynamical behaviour of biological regulatory networks.
II. Immunity control in bacteriophage lambda. Bull. Math. Biol., 1995. 57(2): p. 277-
297.

28 Thieffry, D., et al., From specific gene regulation to genomic networks: a global
analysis of transcriptional regulation in Escherichia coli. BioEssays, 1998. 20(5): p.
433-440.
29 Yuh, C.-H., H. Bolouri, and E.H. Davidson, Genomic cis-regulatory logic: experimen-
tal and computational analysis of a sea urchin gene. Science, 1998. 279: p. 1896-1902.
30 Yuh, C.-H., H. Bolouri, and E.H. Davidson, Cis-regulatory logic in the endo16 gene:
switching from a specification to a differentiation mode of control. Development, 2001.
128: p. 617-629.
31 Arnone, M.I. and E.H. Davidson, The hardwiring of development: Organization and
function of genomic regulatory systems. Development, 1997. 124(10): p. 1851-1864.
32 Kauffman, S.A., Gene regulation networks: a theory for their global structure and be-
havior. Current Topics in Dev. Biol., 1969. 6(145).
33 McAdams, H.H. and A. Arkin, Stochastic mechanisms in gene expression. Proc. Natl.
Acad. Sci. USA, 1997.94: p. 814-819.
34 Gibson, M.A. and J. Bruck, Efficient stochastic simulation of chemical systems with
many species and many channels. J. Phys. Chem. A, 2000. 104: p. 1876-1889.
35 Băttner, R., K. Bellmann, and T. Cierzynski, A discrete stochastic simulation model of
the regulation of gene expression with variable control characteristics., in Molecular
genetic information systems: modelling and simulation., K. Bellmann, Editor. 1983,
Academic-Verlag: Berlin.
36 Endy, D., D. Kong, and J. Yin, Intracellular kinetics of a growing virus: A genetically
structured simulation for bacteriophage T7. Biotechnol. Bioeng., 1997. 55(2): p. 375-
389.
37 Thieffry, D. and R. Thomas. Qualitative analysis of gene networks. in Pacific Sympo-
sium on Biocomputing. 1998. Maui, Hawaii, USA: World Scientific.
38 Shimada, T., et al. Knowledge-based simulation of regulatory action in lambda phage.
in 1st IEEE Int. Symposium on Intelligence in Neural and Biological Systems
(INBS'95). 1995.
39 McAdams, H.H. and L. Shapiro, Circuit simulation of genetic networks. Science, 1995.
269(5224): p. 650-656.
40 Reinitz, J., E. Mjolsness, and D.H. Sharp, Model for cooperative control of positional
information in Drosophila by Bicoid and Maternal Hunchback. J. Exp. Zool., 1995.
271: p. 47-56.
41 Nair, T.M., S.S. Tambe, and B.D. Kulkarni, Analysis of transcription control signals
using artificial neural networks. Computer Applications in the Biosciences, 1995.
11(3): p. 293-300.
42 Stoffers, H.J., et al., METASIM: object-oriented modelling of cell regulation. Com-
puter Applications in Biosciences, 1992. 8(5): p. 442-449.
43 Meyers, S. and P. Friedland, Knowledge-based simulation of genetic regulation in bac-
teriophage lambda. Nucleic Acids Res., 1984. 12(1): p. 1-9.
44 Friedland, P., et al., GENESIS, a knowledge-based genetic engineering simulation sys-
tem for representation of genetic data and experiment planning. Nucleic Acids Res.,
1982. 10(1): p. 323-340.
45 Karp, P.D., Artificial-intelligence methods for theory representation and hypothesis
formation. Comput. Appl. Biosci., 1991. 7(3): p. 301-308.
46 Koza, J., Genetic programming: on the programming of computers by means of natu-
ral selection (complex adaptive systems). 1992, Cambridge MA: MIT Press. 819.

47 Dellaert, F. and R.D. Beer. Toward an evolvable model of development for autono-
mous agent synthesis. in Artificial Life IV: 4th Int. Workshop on the Synthesis and
Simulation of Living Systems. 1994: MIT Press, Cambridge, Mass.
48 Gallagher, J.C. and R.D. Beer. Evolution and analysis of dynamical neural networks
for agents integrating vision, locomotion and short-term memory. in Genetic and Evo-
lutionary Computation Conference (GECCO-99). 1999. Orlando, Fla.
49 Rust, A.G., et al. Developmental Evolution of an Edge Detecting Retina. in 8th Inter-
national Conference on Artificial Neural Networks (ICANN'98). 1998: Springer, Berlin
Heidelberg New York.
50 Rust, A.G. and R. Adams. Developmental Evolution of Dendritic Morphology in a
Multi-Compartmental Neuron Model. in 9th International Conference on Artificial
Neural Networks (ICANN'99). 1999: London, IEEE.
51 Rust, A.G., R. Adams, and H. Bolouri. Evolutionary Neural Topiary: Growing and
Sculpting Artificial Neurons to Order. in 7th International Conference on the Simula-
tion and Synthesis of Living Systems (Alife VII). 2000.
52 Beerli, R.R., B. Dreier, and C.F. Barbas III, Positive and negative regulation of en-
dogenous genes by designed transcription factors. Proc. Natl. Acad. Sci. USA, 2000.
97(4): p. 1495-1500.
53 Elowitz, M.B. and S. Leibler, A synthetic oscillatory network of transcriptional regula-
tors. Nature, 2000. 403: p. 335-338.
A Model of Bacterial Adaptability Based on
Multiple Scales of Interaction: COSMIC

R. Gregory, R. Paton†, J. Saunders, and Q.H. Wu

R. Gregory, Department of Computer Science, The University of Liverpool,
Liverpool L69 3BX, UK. R.Gregory@liv.ac.uk

R. Paton†, Department of Computer Science, The University of Liverpool,
Liverpool L69 3BX, UK.

J. Saunders, School of Biological Sciences, The University of Liverpool,
Liverpool L69 3BX, UK.

Q.H. Wu, Department of Electrical Engineering, The University of Liverpool,
Liverpool L69 3BX, UK.

Abstract

Evolution has frequently been seen as a result of the continuous or discontin-
uous accumulation of small mutations. Over the many years since Darwin,
it has been found that simple point mutations are not the only mechanism
driving genomic change; for example, plasmids, transposons, bacteriophages,
insertion sequences, deletion and duplication, and stress-sensitive mutation
all have a part to play in directing the genetic composition and variation of
organisms towards meeting the moving target that is the environmental ideal
that exists at any one time. These generate the variation necessary to allow
rapid evolutionary response to changing environmental conditions.
Predictive models of E. coli cellular processes already exist; these tools
are excellent models of behaviour. However, they suffer the same drawbacks:
all rely on actual experimental data as input and, more importantly, once
input, those data are static. The aim of this study is to answer some of the
questions regarding bacterial evolution and the role played by genetic events
using an evolving multicellular and multispecies model that builds up from
the scale of the genome to include the proteome and the environment in
which these evolving cells compete. All these scales follow an individual-based
philosophy, whereby genes, gene products and cells are all represented as
individual entities with individual parameters rather than the more typical
aggregate population levels in a grid. This vast number of parameters and
possibilities adds another meaning to the name of the simulation, COSMIC:
COmputing Systems of Microbial Interactions and Communications.

1 Introduction
Evolution has frequently been seen as a result of the continuous or discontin-
uous accumulation of small mutations. Over the many years since Darwin,
it has been found that simple point mutations are not the only mechanism
driving genomic change, for example, plasmids, transposons, bacteriophages,
insertion sequences, deletion and duplication, and stress-sensitive mutation
all have a part to play [1,2] in directing the genetic composition and variation
of organisms [3] towards meeting the moving target that is the environmental
ideal at any one time. Considering the probability of single point mutations
arising and repair mechanisms that may act to counteract their accumula-
tion, it is unlikely that simple mutation alone can create rapid diversity. It is
clear that evolutionary change depends more on larger scale changes in ge-
nomic sequences caused by sexual and other forms of horizontal gene transfer.
These generate the variation necessary to allow rapid evolutionary response
to changing environmental conditions.
Predictive models of E. coli cellular processes already exist; the E-Cell
project [4] aims to use gene data directly in a mathematical model of tran-
scription. The Virtual Cell [5,6] project makes use of user-defined protein re-
actions to simulate compartments at the nucleus and cellular level. Gepasi 3 [7]
also models protein reactions, but from within an enclosed box environment.
The BacSim [8] project simulates individual cell growth at the population
scale. Eos [9] is also based at the population scale, but is intended as a frame-
work for testing idealised ecologies, represented by evolutionary algorithms.
These tools and those that they rely on are excellent models of behaviour.
However, they suffer the same drawbacks; all rely on actual experimental
data to be input and, more importantly, once input that data is static. The
aim of this study is to answer some of the questions regarding bacterial evo-
lution and the role played by genetic events other than simple point mutation
using an evolving multicellular and multispecies model that builds up from
the scale of the genome. In effect, it is not bacterial evolution that is being
interrogated, but the co-evolution of bacteria and any organism that has a
direct effect on the genetics of those bacteria.
To test these questions, it is necessary to build a model that attempts to
encompass what are considered the important qualities of bacterial evolution
and bacterial life, but is not so overly specified as to constrain the results. The
model is therefore a careful balance of biological and computational reali-
ties [10] with an emphasis on open-endedness [11]. The biological literature
has many examples of the possible forms of mechanism within the relatively
'simple' example of E. coli, but even this must be carefully constrained. It
is clear that computer models lack computational power when compared to
real world processes.
In focusing attention on aspects of the E. coli system, it is clear that
there are two new insights provided by the emerging disciplines of genomics
and proteomics. Proteomics is the study of enzyme and protein interactions.

Traditionally this meant differential equation models of interaction. However,
nowadays there seems also to be an implicit link with the application of
protein descriptors derived from sequence information in identified genes [12],
an application that has only recently become tractable with the arrival of
accurate genome data. Genomics is the study of genome structure, interaction
and encoding and has been stimulated by the Human Genome project [13] as
well as whole genome sequencing projects for many other organisms, notably
those for numerous bacteria. The genome should perhaps be regarded not as a
book that is continually read from, but rather as a program that is continuously
executed and adapted over the lifetime of individual cells, tissues or entire
organisms. From this it appears that interactions within cells involve the
combined effects of enzymes, structural and regulatory proteins acting on
genes, which in turn act on those enzymes and other proteins, creating a huge
number of both positive and negative feedback loops necessary for controlled
execution [14]. The ideal model therefore is one that takes both these stages
into account and allows for the evolution of the genome in the presence of
other genomes, each genome being an implementation of what many conceive
as the computational cell [15-21].
Looking at this another way, the main aim of COSMIC is evolutionary
modelling based on biologically realistic organisms. Evolution requires some-
thing akin to a genome, and the above argument shows a genome by itself is
inert and only part of the overall mechanism. There are then three themes to
the COSMIC model: the environment, the genome and functional proteins,
including enzymes.

2 Biology
2.1 DNA, RNA and Proteins

As is widely known, DNA is the carrier of genetic information and
stores all instructions for the development of every known living organism.
In its simplest representation, DNA is a chain of the bases adenine,
cytosine, guanine and thymine, linked together by phosphodiester (sugar-phosphate)
bonds to form a regularly spaced chain. For each DNA strand
there is normally an anti-parallel complementary strand annealed to
it by hydrogen bonding. Cytosine complements guanine and adenine complements
thymine (the relationship is reciprocal). Because only complementary bases
hybridise, while bases on the same strand are chained together, the complementary
bases point towards each other and the phosphodiester bonds run along the
outer edges of the strand pair. The complementary strands wrap around each
other in a coil, and this coil is itself regularly coiled. To put some numbers to
the dimensions of the strand(s): the coil formed by the two coiled strands is
2 nm in diameter and the distance between adjacent bases is 0.34 nm. What
is difficult to imagine is the scale of the object: in humans this double helix is
164 R.Gregory, R.Paton, J.Saunders, and Q.H.Wu

then coiled around itself to a diameter of 11 nm, coiled around itself again to
a diameter of 30 nm, coiled again to a diameter of 300 nm and then wound
into a 700 nm diameter spring-like formation.
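The base-pairing rule described above can be sketched as a short helper function. This is our own illustration of the complementarity rule (the names are ours), not part of COSMIC:

```python
# Complementarity rule described above: cytosine pairs with guanine,
# adenine pairs with thymine, and the relationship is reciprocal.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand_5to3: str) -> str:
    """Return the anti-parallel complementary strand, also read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand_5to3))
```

For example, `reverse_complement("ATGC")` yields `"GCAT"`, and applying the function twice returns the original strand, reflecting the reciprocal relationship between the two strands.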
The E. coli genome is around 4.6 Mbp (million base pairs) long and
arranged in a circle, composed of many loops each 50 to 100 kbp long joined
by proteins. The DNA is supercoiled, and is further compacted by proteins,
e.g. HU, a dimeric protein that binds DNA non-specifically and wraps the
DNA around itself.

2.2 Transcription

Part of the central dogma proposed by F.H.C. Crick is that RNA is created
from DNA in a transcription phase, where information from DNA is transcribed
into RNA. RNA differs from DNA in that it contains uracil in place
of thymine and is frequently single-stranded, but may loop back on itself
where sections of intrastrand homology allow base pairing. Some reports
(e.g. [22]) have suggested that RNA could be the initial basis for life. The
transcription process is carried out by RNA polymerase (an enzyme specific
for the creation of RNA species on a DNA template). For the purposes of the
COSMIC simulation, all references to RNA that follow should be regarded
as mRNA. However, it should be appreciated that other types of RNA are
required by living cells as components of the translation apparatus.
The first step in the creation of a protein from the DNA is transcription
by a DNA-dependent RNA polymerase (a complex protein-based machine).
RNA polymerase catalyses transcription in a process that requires double-stranded
DNA as well as the nucleotides ATP, GTP, CTP and UTP. Transcription
is directional, with mRNA chain growth progressing in the 5' to 3'
direction using a DNA template strand with the opposite polarity (3' → 5')
(the antisense strand). The complementary strand (running 5' → 3') is called
the sense strand. In E. coli the RNA polymerase moves at 40 bases per
second at 37 °C, transcribing as it moves.
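The template-to-mRNA step just described can be sketched in a few lines. This is an illustration of the base-pairing rule and the quoted polymerase speed, not code from COSMIC:

```python
# Transcription rule described above: the mRNA complements the 3'->5'
# template strand, with uracil (U) appearing in place of thymine (T).
RNA_COMPLEMENT = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3to5: str) -> str:
    """Template read 3'->5'; the returned mRNA grows 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5)

# At 40 bases per second (E. coli at 37 degrees C), a transcript of
# n bases takes n / 40 seconds to produce.
def transcription_time_seconds(n_bases: int, rate: float = 40.0) -> float:
    return n_bases / rate
```

For instance, the template `"TACGGA"` (read 3' to 5') transcribes to the mRNA `"AUGCCU"`, and a 4,600-base transcript would take 115 s at the quoted rate.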
To start transcription, an RNA polymerase molecule binds to the double-stranded
DNA, ideally at a promoter site. The RNA polymerase and all required
cofactors constitute a transcription complex. The initial base (start)
of the transcribed region after the promoter is known as the +1 position, and
the promoter and any operator sites have negative nucleotide designations
relative to this position. As with replication, the DNA strands must
be unwound to allow transcription both to be initiated and to proceed. Transcription
stops at a terminator sequence, which contains a self-complementary
region that can form a stem-loop or hairpin structure out of the RNA product.
This structure ensures that the transcription complex stops and promotes
dissociation of its constituent parts.
Model of Bacterial Adaptability 165

2.3 Protein Structure

There are two classes of proteins, globular and fibrous. Globular proteins can
be regarded as spherical particles as they are folded compactly. Most enzymes
are globular. Fibrous proteins have a high axial ratio (length/width)
and are typically used in a structural role. A protein takes a shape dictated
partly by the primary polypeptide sequence and is stabilised by a variety of
forces which hold it together; hydrophilic side chains tend to the outside and
hydrophobic amino acids remain on the inside.
Except for a few catalytic RNA molecules, an enzyme is almost certainly a
protein. An enzyme is a catalyst for reactions that would otherwise occur
only very slowly. COSMIC considers enzymes essential: the large difference in
reaction rates makes it not worth considering the case where the reaction occurs
without the enzyme. As a result, COSMIC ignores the passive components
and models only the enzymes.
This is also supported by another limitation: a complete system would
involve the interplay of basic chemical building blocks and physical processes,
requiring the implementation of all the chemical and physical processes that
can possibly occur. Unlike more fixed simulations, COSMIC must be
open ended and so must support a complete range of possible chemical and
physical interactions without favouring one set of fine-grained interactions
over another set. But, as there are more processes than time to implement
them, and too many important processes have no predictable function in an
evolving environment, COSMIC is deliberately limited to avoid fine-grained
chemistry and undecidable processes; instead the focus is on the concepts of
optional transcription and evolutionary mechanisms.
The encoding from DNA to protein function is ill defined (hence the
protein folding problem [12, 23]). The bases are read three at a time as
codons and converted into 20 different types of amino acid that are linked by
peptide bonds to make a polypeptide chain. There are 64 potential codon
combinations, many more than required to encode all possible
amino acids, but allowing the inclusion of two start codons, three stop codons
and varying degrees of redundancy in codon specificity for each
amino acid. Although the amino acid alphabet is only 20 letters in size, a polypeptide
can form words of several hundred amino acids in length. This, combined
with the fact that shape (influenced by the encoding) dictates the role of
the protein, makes the task of determining the role of a real protein
nearly intractable. The shape can take any imaginable three-dimensional form
(obviously the chain cannot go through itself), using regular patterns that
change over its length.
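The codon arithmetic above is easy to verify, and the triplet reading frame can be sketched directly. The stop codons are the standard three; everything else here is an illustrative helper, not part of COSMIC:

```python
from itertools import product

# Four bases read three at a time give 4**3 = 64 possible codons, more
# than enough for 20 amino acids plus start and stop signals.
ALL_CODONS = ["".join(triplet) for triplet in product("ACGU", repeat=3)]

STOP_CODONS = {"UAA", "UAG", "UGA"}  # the three stop codons noted above

def read_codons(mrna: str):
    """Yield codons from an mRNA three bases at a time until a stop codon."""
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return
        yield codon
```

So `list(read_codons("AUGGCUUAA"))` returns the two coding codons `["AUG", "GCU"]` and halts at the stop codon `UAA`.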

2.4 Optional Transcription

To add a further level of complexity, there is variation in gene expression and
control, as enzymes and regulatory proteins may be used to control transcription
of a region of DNA. These controlling proteins are either transcription
repressors or activators, and may themselves be controlled by other activators.
The process is partly explained by the Jacob-Monod theory, which is
generally applied to prokaryotic gene expression. The theory uses the unit of
the operon, which is made up of an adjacent group of structural genes known
as cistrons (a cistron being equivalent to a gene), preceded by an operator
region. The operator region forms a site on to which a repressor protein
(a DNA-binding protein) can attach. Once that is attached, the mRNA species
encoded by that region, and hence the protein, are not manufactured. If an
inducer molecule is present, it can bind to the repressor molecule and nullify
the effect of the repressor. There are also cases involving a corepressor,
whereby the repressor will only bind to the operator region when it has
already combined with another cellular component of the required type. The
regions on the DNA strand can be shown with Z representing the cistrons,
P representing the promoter and O representing the operator. Repressing or
activating proteins reflect both the genotypic and environmental
state of a bacterial cell, creating fairly rapid complex processing without a
nervous system.

2.5 lac Operon

The principal idea of optional transcription comes from research on the lac
operon found in E. coli. This is only one heavily researched example of
the common idea of optional transcription.

The lac operon is concerned with the use of lactose as a carbon source;
the enzymes that enable the utilisation of lactose as a carbon and energy
source are only created when lactose is available. The lac operon consists
of the structural genes lacZ encoding β-galactosidase, lacY encoding
a galactoside permease and lacA encoding a thiogalactoside transacetylase.
β-galactosidase is an enzyme that hydrolyses lactose into galactose and glucose.
Galactoside permease aids in lactose transport through the cell wall of
the bacterium.
These three genes are encoded side by side in a single transcriptional
unit called lacZYA. Relative to this there is an operator site Olac between
-5 and +21 bases relative to the transcription start point, just after the
promoter site Plac. If the operator site binds a lac repressor protein then
transcription is strongly repressed. The lac repressor protein itself is encoded
slightly upstream of the lac operon promoter in the lacI gene, which is also
part of the lac operon.
The lacI gene encodes the repressor protein, but the protein itself is active
only as a tetramer: the gene product only works in groups of four. Once assembled,
the repressor has a very strong affinity for the lac operator
and also an appreciable affinity for non-operator DNA.

The lac operator site is palindromic, consisting of 28 bp which read the
same starting from either the 5' or the 3' end. The lac repressor has the same
symmetry when assembled as a four-unit tetramer.
In the absence of lactose, the repressor protein binds to the operator
site, though it is thought that this does not stop the RNA polymerase from
binding and instead just stops its progress. The binding of the lac repressor
to the operator site increases by two orders of magnitude the affinity of the
RNA polymerase for the promoter site, making it quite likely that an inhibited
operon also has bound RNA polymerase.

When repressed, the lac operon generates a very low level of gene product.
When lactose is present, this low level of expression allows slow uptake
of lactose, some of which is converted to allolactose. Allolactose binds the
lac repressor, changing its affinity for the operator site and so forcing the
unbinding of the repressor. As the RNA polymerase will probably already be
present, transcription can start immediately. The removal of the lactose inducer
leads to a rapid inhibition of transcription, as rebinding of the repressor
is almost immediate and the lacZYA RNA transcript is very unstable.
The promoter site Plac and other related promoters have a strong affinity
for RNA polymerase; the -35 sequence can be weak, and even the -10
sequence can be weak in other promoters. For high rates of transcription
initiation of the lac operon a specific activator protein called the cAMP receptor
protein (CRP, or Catabolite Activator Protein, CAP) is required. CRP
exists as a dimer that cannot by itself have any effect on transcription rate.
When glucose is absent, the level of cAMP increases and CRP binds to cAMP,
producing a CRP-cAMP complex that binds to the promoter site slightly upstream
from where the RNA polymerase would bind. The DNA is bent by
the presence of CRP, forming a 90° bend which is believed to multiply RNA
polymerase binding affinity 50-fold. In practice, the location of the CRP
binding site can vary much more between operons than stated here; the site
can be on the promoter, next to the promoter, or much further upstream.
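The control logic described in this section can be summarised as a small truth table. This qualitative sketch is our own reading of the text, with illustrative names, not an extract from COSMIC:

```python
def lac_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lacZYA expression level under the lac operon controls."""
    if not lactose_present:
        return "repressed"   # lac repressor stays bound at the operator
    if glucose_present:
        return "low"         # cAMP stays low, so no CRP-cAMP boost to initiation
    return "high"            # repressor released and CRP-cAMP bound
```

The operon thus behaves as a two-input gate: strong expression requires lactose present and glucose absent.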

2.6 trp Operon

The tryptophan operon encodes five structural genes that are required for
tryptophan synthesis. The RNA transcript produced is a single 7-kb long
strand, starting from the operator site Otrp. As with the expression of the
lac operon, the RNA product is unstable, and so regulation at the DNA level
quickly regulates the protein end product, which in this case is tryptophan.

The trpR operon is the source of the trp repressor and is located somewhere
upstream of the trp operon. The operator sequence is symmetrical and
forms the repressor binding site, which also overlaps with the trp promoter
site between bases -21 and +3. The core repressor binding site is a palindrome
18 bp long. The trp repressor only actively binds the operator site when it
has itself formed a complex with tryptophan. The repressor is a dimer and
has a structural similarity to the CRP protein and the lac repressor; the dimer
needs two tryptophan molecules to be complete. It is the tryptophan that gives the
dimeric structure the correct distance between its two reading heads and its
central core.

The five structural genes encode enzymes that produce tryptophan.
Tryptophan therefore inhibits its own synthesis, by a factor of 70 (presumably
under tryptophan-saturating conditions), which is much
smaller than the repression caused by binding of the lac repressor.
So far, the trp operon seems like the lac operon, but for the self-inhibition.
As well as the normal transcription controls there is also an attenuator sequence
following a leader sequence of around 162 nt before the first structural
gene trpE. This attenuator is a short area rich in palindromic GC bases
followed by a run of U bases, and it is also rho-independent. If this sequence
manages to form a hairpin structure in the transcribing RNA, then it will act as
a terminator and force early termination at around 140 nt, stopping
before the structural genes have been reached.
The leader itself also has a role to play. It is divided into four successive
sequences, of which 1 and 2, 2 and 3, and 3 and 4 are mutually complementary, and
so can pair to form a hairpin which stops further RNA transcription. If 2
and 3 bind then this does form a hairpin, but one that does not stop transcription.
Under normal conditions the binding of 1 to 2 and 3 to 4 is more favourable
than 2 to 3.
Also in this leader is an efficient ribosome binding site and successive
codons encoding tryptophan. Under conditions of low tryptophan availability
the ribosome will pause at this point. Since transcription and translation
are tightly coupled in E. coli, the net effect is to control tryptophan
transcription negatively. In reality, hairpin formation between sequences
3 and 4 is more likely when the tryptophan level is high. When tryptophan
is scarce, the pause occludes sequence 1, leaving sequence 2 free to bind with
sequence 3. In the alternative case, the pause occurs at the start of sequence 2,
so sequence 2 is occluded, allowing sequences 3 and 4 to form a hairpin.
Given both forms of optional transcription, the tryptophan-dependent repressor
and the tryptophan-dependent attenuation region, the total level of tryptophan
synthesis can be regulated over a 700-fold range, the attenuation sequence giving a 10-fold
factor and the tryptophan-dependent repressor a 70-fold factor.
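The two control layers multiply, which is where the 700-fold figure comes from:

```python
repressor_fold = 70    # tryptophan-dependent repressor (quoted above)
attenuation_fold = 10  # attenuation region (quoted above)

# Independent control layers compose multiplicatively.
total_fold = repressor_fold * attenuation_fold
print(total_fold)  # 700
```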
Generally, attenuation is present in at least six other operons with a role
in amino acid synthesis. An example is the his operon, but in this case the
attenuation mechanism is the only means of control; there is no operator.

2.7 An E. coli Environment

It is clear that prokaryotes, including E. coli, rely on horizontal transfer of
genetic information [24]; this comes about from contact with free DNA
(transformation), from cell-to-cell contact (conjugation) and through infection by bacterial viruses
(bacteriophages) occasionally carrying non-viral DNA. An environment is
clearly needed to simulate bacteria, phages and 'environmental DNA'. An
environment is also a useful tool with which fitness can be implicitly defined.
Fitness is defined here as a measure of genome convergence towards the ideal.
However, the changing nature of the environment ensures that there is no
single ideal, and that currently better solutions may become inferior over time.
More immediate genome fitness is obtained by direct response to the environment:
the environment is enriched with glucose in a patchy distribution,
and each cell has a number of receptors that transport glucose to the cell
interior. The genome in combination with the proteome normally responds
with activation of flagella biosynthesis, allowing motility by a swimming action
[25]. COSMIC does not synthesise the flagella: simulation time can be
shortened by assuming they exist; instead it is glucose that activates the rotation
of the flagella. There are a variety of possible bacterial carbon and sugar
sources; metabolic modelling is not the focus of COSMIC, so only glucose is
currently implemented.

3 The Genome and the Proteome

On the surface, a model genome would seem to be best represented by a
simple string of letters. The literature shows that although this is true, there
are layers of interpretation that can be placed on top of the simple representation.
Genome maps [26] show that the strings are divided
into non-uniform lengths, each of these identifying some gene or other active
string sequence. COSMIC considers the genome to follow this representation,
with the DNA string representation just under the surface as a method
of assigning gene labels and mutually reactive enzymes. Sequences can be
broadly categorised into those that encode a protein and sequences that act
as regulatory structures on which proteins (or further nucleic acid sequences)
act [27,28]; the reality is that these types overlap, and multiple subtypes can
be formed from the same sequence. As a result, the genome in COSMIC
dynamically encodes eight broad sequence types and allows for mixed
types of single genes. Sequence type then follows from pairing fixed types,
such as the interface to the environment and control sequences, to dynamic
types defined by strings that were originally defined to be only genes. Sequence
pairing is based on the anti-match of the DNA-like encoded strings
and so responds to the changing genome, but is also consistent over time.
Typically, models of differential equations provide the practical simulation
metaphor for biological simulations. At present, it is only E-Cell that incorporates
both the genome and the proteome. Differential equations, however, are
very much akin to statistics: they represent mean quantities and ignore the
irregularities of the system in question. COSMIC uses the individual-based
modelling approach, in which each molecule is considered as an individual
entity with individual parameters. This allows for the inclusion of spatial effects
[29] without imposing artificial spatial boundaries of fixed resolution.
Given a population of functional polypeptides in the cell, each has a chance
of reacting with each other and on the genome. The probability involved is
based on their type as defined by the genome, their position, their half-life
and their age. Each reaction lasts for a variable time based on the type of
pairing. At this point, it must be pointed out that simplification of the real
biological process has led to COSMIC having no explicit molecule type on
which enzymes react. A single type of generic protein is assumed to exist
and all reactions act on that pseudo-protein. This protein can be bound by
enzymes, as well as the normal process of enzymes acting on each other; the
resulting effect is that the enzymes and the single FAP type are not available
to do anything else while bound, allowing another molecule a chance to bind.
Another simplification is in the area of motility: there is a direct relationship
between motile response and optional transcription. Chemical simulation is
beyond the scope of COSMIC, and yet flagella motion is primarily created
by proton motive forces across the cell wall [25], coming from both electrical
potential and chemical potential. As a result, this simplification seems
justified.
It is hoped these necessary simplifications have little impact on the evolutionary
results obtained with COSMIC, the primary point of enquiry being
the change in the genome over the medium and long term;
the genome and the concept of the operon [27,30,31] are intended to give
the COSMIC population their adaptability to respond and adapt; the completeness
of genome operators was therefore seen as important, and gene types
such as operators, promoters, repressors, anti-repressors, terminators, RNA
polymerase and FAPs (Flagella Activation Proteins) form the main classes.
These types should give the genome the ability to express any function required
without the need for realistic metabolites.

4 Model

The fully specified COSMIC model [32] consists of a hierarchy of sets of
objects. The most basic set is the gene, which is part of a genome set, which
is part of a cell set as part of the environment. Each level contains additional
parameters associated with that level; for instance, genes and genomes contain
spatial information that partly determines reaction probabilities. A conceptual
view of the components in an individual cell is shown in Fig. 1.

From the genes come gene products: a single gene can lead to many gene
products, so with each gene is potentially a set of gene products. Each gene
product contains spatial parameters and a calculated time to live (the term
half-life seems inappropriate when describing an individual rather than a
population).
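One way to give each individual its own time to live while remaining consistent with a population half-life is to sample an exponential lifetime. This is a plausible sketch under that assumption, not necessarily the rule COSMIC uses:

```python
import math
import random

def sample_time_to_live(half_life: float, rng: random.Random) -> float:
    """Draw an individual lifetime whose population median equals half_life."""
    # Exponential decay with rate ln(2)/half_life leaves half of a large
    # population alive at t = half_life, matching the bulk definition.
    rate = math.log(2) / half_life
    return rng.expovariate(rate)
```

Each molecule then carries its own concrete expiry time, so the simulation never needs to track a bulk decay curve.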

Fig. 1. Conceptual view of a cell (legend symbols denote flagella, receptors, genome, pathways and proteins)

Gene products can react with other gene products and operate directly
on the genome. Decrypting this relationship requires several steps. Firstly,
because the relationships between gene and protein function are currently
intractable (especially for completely unknown genes), function mapping has
been constrained to use only a direct Hamming distance between genes. A
gene (which always leads directly to its gene product) can react with another
gene if the Hamming distance is below a specified threshold. When
specified for all genes, these relations give a picture of which gene products
(enzymes) can react. Not all sequences are genes; control sequences are also
considered in the Hamming process. This creates interaction paths between
control sequences and gene products, thus allowing these gene products to
then be called sigma factors, repressors and inducers. A summary of possible
interactions, sources of transcription products and a map of indirect
or direct attenuation are shown in Fig. 2. It is important to note that the
term 'enzyme' in COSMIC is used loosely to account for a gene product that
influences a chemical reaction.
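The reaction rule just described can be sketched directly. The threshold value here is illustrative, not one taken from COSMIC:

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def can_react(seq_a: str, seq_b: str, threshold: int = 3) -> bool:
    """Two genes (and hence their products) may react when close enough."""
    return hamming_distance(seq_a, seq_b) < threshold
```

Applying `can_react` over all gene pairs yields the relation set the text describes: a picture of which gene products can interact, with control sequences included in the same comparison.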
Given a genome comprising genes, the gene products of each and a set
of relations describing what can happen, the only missing set is the set of reactions
that are happening. This is based on the relations of possible reactions; each
relation carries with it a set of reaction instances. Each instance carries with
it references to the specific enzyme instances involved as well as a mutual
position in space and a reaction duration based on the type of relation.


Fig. 2. Enzyme type interaction

The only aspect missing from this system is sense/response networks. For
this, COSMIC considers receptors on the cell wall (input) and flagella motion
(output) to be directly related to the transcription networks. The receptors
represent imitation gene products that exist in a fixed position. These can
bind to normal gene products using a parallel set of relations also based on
Hamming distance.

The flagella response (output) follows in the same way. In both cases, the
effect is not one of conversion, i.e. a receptor does not place a protein in the cell
that goes on to be converted to something else by gene products. Crossing
the protein and chemical scale was never the remit of COSMIC. Instead, the
mode of operation follows occupancy: if a gene product is bound at one
location, it cannot also be involved in a reaction elsewhere. This is quite a
simplification, but as the role of COSMIC is mainly in evolution it seems
justified.
The largest scale of COSMIC is the environment. To test evolution, a
population of cells must compete side by side for what exists in a common
environment. To this end all cells live in a glucose-rich environment that is
depleted over time by the cells. In common with genetic algorithms, better
cells in better regions of the environment will grow faster and so multiply
faster. For this COSMIC uses quantitatively accurate cell growth equations
[8,33,34].
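As a stand-in for the cited growth equations (whose exact form we do not reproduce here), a Monod-style saturating rate illustrates how cells in glucose-rich regions grow and multiply faster; the parameter values are illustrative:

```python
def specific_growth_rate(glucose: float,
                         mu_max: float = 1.0,
                         k_s: float = 0.5) -> float:
    """Monod kinetics: growth rate saturates as local glucose rises."""
    # mu_max is the maximum growth rate; k_s is the glucose level at
    # which growth runs at half its maximum (both hypothetical values).
    return mu_max * glucose / (k_s + glucose)
```

A cell sitting in a depleted patch (glucose near zero) barely grows, while one in a rich patch approaches `mu_max`, producing exactly the differential multiplication the selection scheme relies on.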

The combination of input reward based on cell position, and cell position
based on flagellum output, produces an indirect reward-based system that is
the basis on which the simulated E. coli evolve.

Evolution, and other causes of cell death, are then supplied by the other
organisms and by mutations, in the form of both simple point mutation and the
longer gene sequence duplication and gene sequence deletion. This creates
an ecology where it is good to be fit and also good to be genetically unique
[35,36]; an ecology such as this also allows for a dynamic population size.

5 Implementation
The COSMIC system is clearly very large; the lack of explicit boundaries,
the individual nature of the simulation and the need always to model multiple
cells to stimulate evolution all contribute to this. Considerable
computational resources are therefore required to simulate evolution at a reasonable pace.
For this COSMIC currently runs on a Grid-enabled cluster of 12 dual-processor
AthlonXP 2000+ nodes. This leads to a rapid simulation, but one
still slower than real time, on the order of 7:1. In the space of 4 days COSMIC
had evaluated 1,176 bacterial cells, with 170 still living at the point the simulation
ended. The final environment had turned into a bacterially challenging
patchwork of nutrients; final genomes were in the range 48 to 822
genes long (mean 168), with 10 to 104,633 enzymes per cell (mean 12,693).
CPU utilisation varied in the range 95-99%, creating around 2.5 gigabytes of
data per day for 13 h of simulated bacterial life. This required 11 gigabytes
of total storage space, with additional storage for visualisations.
The architecture follows the client-server model, the environment being
the server and the cells being the clients. This ensures efficient parallelisation, as
cell intercommunication is rare and the environment has minimal processing
needs; the overall result is that a linear growth in computational nodes allows
a linear growth in the product of simulation speed and the total simultaneous
number of cells.

6 Results
The COSMIC system clearly has two distinct scales: the environment state
and the state of each cell. COSMIC can generate a vast amount of detailed
raw data on the environment and, more importantly, the gene expression of
each cell coupled with gene linkage maps. The gene expression data show
that even random, loosely connected genomes can achieve a wide range of
transcription rates and connections to the environment.

COSMIC can be considered a batch process, but for the never-ending
nature of the batch. The architecture ultimately allows a never-ending simulation
in which state can be recorded and reloaded while changing the global
and local parameters. The current COSMIC produces a variety of common
visualisations; the common feature is the abstract nature of the labels: real
proteins are not simulated, so the data can be isomorphic but not easily
comprehensible. These visualisations include per-cell gene expression charts and network
graphs representing interacting genes. Also generated are per-cell graphed
averages of major parameters, snapshots of the environment at the population
level, and charts showing the lineage of all cells. For unconverged
and therefore random genomes such as these, however, most make little sense
without careful individual analysis.

6.1 Environmental Macroscopic View

Fig. 3. Environmental change after 5,530 s

The view of the environment is presented in Fig. 3. White equates to a
glucose level of 4.5 mg, black equates to none. The area shown is 0.2 mm square.
Black circles represent bacteria that have not moved through lack of connection
with their flagella; grey streaks show moving bacteria. (Bacteria in this
system cannot move without leaving some visible trail.) Per-cell glucose use
has been exaggerated to better motivate evolutionary change.
The genomes are random with only a small degree of mutation, and so
intelligent behaviour cannot be observed. What can be seen is cell 42 and
cell 177 moving slowly near the bottom right of the picture, and the majority
of cells not moving at all.
The labelling system of cells is based on an ever-increasing unique identifier
that a cell obtains upon creation as either a child or a newly initialised
cell. When a cell divides, the parent retains the original identifier and the child
receives the next unused identifier. At that instant both cells are the same
in terms of genetics but differ in their share of enzymes, as each enzyme has a
50:50 chance of staying with the parent. It is an implementation decision that
parents retain their identifier rather than obtain another identifier; the result
of this decision is seen in the following charts, which show the figures for cells
42 and 177 over the course of their lives without ever referring to the children
of 177 or the other children of 42, despite them being effectively equivalent
at the instant of division and being subjected to the same level of mutation.
This is for convenience and brevity, as these two cells represent a major lineage
and contain genetic differences, making them good candidates to single out
and present from all those available. The convenience comes from avoiding
pasting together results, for instance from cell 177 then switching to cell 255
and then switching again to cell 276. This is possible but unnecessary when
cells at division are effectively equivalent.
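The division rule described above (parent keeps its identifier, child takes the next unused one, each enzyme staying with the parent with probability 1/2) can be sketched as follows; the function name and structure are our own, not COSMIC's:

```python
import random

def divide_cell(parent_id, enzymes, next_unused_id, rng):
    """Split a cell: genetics are copied, enzymes are shared 50:50 at random."""
    parent_share, child_share = [], []
    for enzyme in enzymes:
        # Each enzyme independently stays with the parent with probability 1/2.
        (parent_share if rng.random() < 0.5 else child_share).append(enzyme)
    return (parent_id, parent_share), (next_unused_id, child_share)
```

For example, dividing cell 42 would leave the parent labelled 42 and produce a child labelled 177, each holding roughly half of the enzyme population.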

6.2 CeH Lineage

[Lineage chart data: Start=881, Offspring=259, End=48599, Max Depth=27, Span=47718]

Fig. 4. Cell 42 lineage

Lineage charts collect together related cells in a hierarchy, starting with
the original parent cell and its random genetics, through related generations
which inherit the entire parental genome and, on average, 50% of the enzymes.
Figure 4 shows cell 42 and the start of its entire lineage, initially seeming
to struggle for life, but then dividing to create cell 177. Then something
obviously changes, as 42 dies but 177 goes on to produce a large population of
related cells. This lineage still existed several hours later when the simulation
was stopped, having 108 cells of the same lineage. The annotation shows the
cell identifier when spontaneously created (above the line), the cell identifier
at death (below the line), the division of cells (to the right of the graph)
and, on the left hand side, absolute time (simulated seconds) with the time
difference to the previous event.

As used here the label 42 can then represent a lineage starting with the
original cell identifier 42, making 177 a branch of this same lineage. The label
42 can also represent all the cells that had the 42 identifier, denoting a part
of the complete lineage. However, specifying the time and a cell identifier
uniquely identifies a cell: given a time of 5,530 s the label 42 refers to a cell
that is the third generation offspring of the original cell 42. Here we generally
refer to the latter (i.e. time and cell identifier), even if no time is specified,
as individual cells are the focus of attention. The diagrams and graphs that
follow refer to the partial lineage, for the convenience of giving parents the
same identifier.

Not all cells that exist in the environment are shown here; this represents
the cells whose common ancestor was cell 42. At the time, cells from other
lineages also existed, mostly very short-lived cells that had one or two
children before their entire lineage died out.
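The labelling convention can be made concrete with a small sketch. The data layout below, a list of division events, is hypothetical rather than COSMIC's internal format, and the division times are invented for illustration:

```python
# Hypothetical lineage record: (division_time_s, parent_id, child_id).
# In the convention above a parent keeps its identifier across divisions,
# so a (time, identifier) pair picks out a unique cell: the generation of
# that identifier alive at that time.

def generation_at(events, ident, time):
    """Count how many times cell `ident` has divided up to `time`."""
    return sum(1 for t, parent, _child in events if parent == ident and t <= time)

# Invented division times for lineage 42 (not read from Fig. 4).
events = [(1000, 42, 101), (3000, 42, 150), (5000, 42, 177)]
print(generation_at(events, 42, 5530))  # 3: a third-generation cell labelled 42
```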

6.3 Gene Expression


COSMIC produces gene expression charts for all possible expressible genes at
a maximum time resolution of δt s. However, a more useful resolution of 1 s is
used in Fig. 5, showing gene expression for 82.5 min from initialisation, 15 min
per row (separated by white borders). Normally colours and shades of colour
indicate the levels on a logarithmic scale, shades of blue being the
lowest, green being the highest and vertical grey bars marking unused space
that compensates for the change in genome size over time. As presented here,
shades of grey are used throughout; the other greys represent the translated
blue, green and red scales. Analysis of these expression data is currently
limited to manual inspection, the indicators being gene usage, the change in
expression pattern and the more obvious cases of gene insertion/deletion.
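The logarithmic shading can be sketched as follows; the concentration range and the number of grey levels here are illustrative choices, not COSMIC's actual palette:

```python
import math

# Map an enzyme concentration to one of a fixed number of grey levels on a
# logarithmic scale. The range and level count are illustrative, not COSMIC's.

def grey_level(concentration, floor=1e-3, ceil=1e3, levels=16):
    c = min(max(concentration, floor), ceil)          # clamp to the plotted range
    frac = math.log10(c / floor) / math.log10(ceil / floor)
    return round(frac * (levels - 1))                 # 0 = darkest, 15 = lightest

print(grey_level(1e-3), grey_level(10.0), grey_level(1e3))  # 0 10 15
```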
Figure 5 shows cell lineage 42, the initial cell of the long-running lineage.
As can be seen from the first row, there were many gene duplication events,
leading to an overall enlarging of the genome. Gene expression took some
time to settle down and never fully converged on any one pattern. This could
only be a good sign that the relatively small changes brought about through
mutation are having a large effect on the proteome. There are three cell
divisions hidden here, at 10.7, 35 and 72.3 min, the latter division creating
cell 177.
Figure 6 shows cell lineage 177, the third generation offspring of cell 42
and what later became the parent of a large number of cells. As can be seen
Model of Bacterial Adaptability 177
Fig. 5. Lifetime gene expression of cell lineage 42 over an 82.5 min period. Shades
of grey equate to the log of enzyme concentration. Vertical grey areas fill unused
space later filled by an expanding genome

Fig. 6. Lifetime gene expression of cell lineage 177 over a 97.5 min period.
Shades of grey equate to the log of enzyme concentration. Vertical grey areas fill
unused space later filled by an expanding genome

this cell is also constantly changing, though the speed of change is not so
dramatic. Comparing the start and the end shows how much the new cell
changed from its parent, and it can also be seen how much larger the genome
grew, despite growth and reduction having the same probability.

6.4 Network Graphs

COSMIC also generates graphs representing interaction networks within each
cell. The meaning of an interaction is based on the types of node sharing a
common edge. Figure 7 shows the interactions that occurred within cell 42
over its lifetime; they are random, with no real structure. Nodes show genes and
imitation genes of input/output receptors. Edges show relationships between
these interactions, either transcriptional relations (adjacent genes) or control
relationships (all the types shown above). Inside each node is the gene code
and the current type associated with this gene. Figures on the edges show
total usage counts for binding and unbinding reactions.

6.5 Cell Statistics

Figure 8 gives some overall statistics for cell lineage 42. The top graph shows
cell volume; starting from some initial value, all cells increase to 0.4 fl then
divide, growth being dependent on substrate. Three cell divisions can clearly
be seen. The next two graphs show the cell x/y position in the environment
(as shown in the environment view figure), over the range of 0.2 mm. Both
graphs show the cell is moving at some speed and so must be covering fresh
substrate regularly - an initial requirement of a converged cell. The third
graph shows total enzyme population over time. As the enzyme population
is divided when the cell divides, the approximate halving of cell contents can
be seen coinciding with the halving of cell volume. The penultimate graph shows
receptor activity over time, that is, the ability of the optional transcription
network of the cell to bind with glucose receptors on the cell wall; it is
regarded as the cell input. As can be seen, there is some activity, but this
would never be enough to guide the cell intelligently around the environment.
The final graph shows flagella response, the equivalent of cell output. This
shows the ability of the network to interact with the flagella and so move
the cell. As can be seen (largely from the x/y graphs) there is sufficient cell
movement to achieve near maximal growth. The important point here is that
the output lacks control; this unconverged cell keeps moving regardless
of input.
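The division bookkeeping visible in these graphs can be sketched as below. The 0.4 fl division volume comes from the text; the linear growth increment is a placeholder, since in COSMIC growth depends on the substrate the cell encounters:

```python
# Sketch of growth-and-division bookkeeping: grow towards the division
# volume, then halve both volume and enzyme count. The constant growth
# increment is a placeholder; COSMIC's growth depends on local substrate.

DIVISION_VOLUME = 0.4   # fl, from the text

def step(volume, enzymes, growth=0.005):
    volume += growth                     # placeholder substrate-independent growth
    if volume >= DIVISION_VOLUME:
        volume /= 2                      # each daughter keeps half the volume ...
        enzymes //= 2                    # ... and roughly half the enzymes
    return volume, enzymes

v, e = 0.30, 4000
for _ in range(40):
    v, e = step(v, e)
print(e)  # one division has occurred, halving the enzyme population to 2000
```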
Figure 9 shows the same overall parameters as before, but for the cell 177
lineage, the third generation offspring of lineage 42. The most obvious
difference between the two is the number of enzymes, which is now
relatively constant, quickly recovering from cell divisions. The unusual case
that appears in so many COSMIC cells is at time 9.5 × 10³ s. There is a rise in

Fig. 7. Partial network interactions over the lifetime of cell 42. Nodes represent
genes. Inside each node is the particular gene sequence given to that gene, along
with a gene type in shortened form, the long form of which is given in Fig. 2. Edges
show a relationship between the particular genes; this includes reflexive edges which
indicate an inhibited RNA polymerase. Figures on the edges show total usage counts
for binding and unbinding reactions

the enzyme population, a fall, a halving caused by cell division, but then the
cell never really recovers and loses its connection with the environment (both
input and output). COSMIC then eventually kills the cell as it is considered
no longer viable.

7 Discussion
The COSMIC model is a growing tool with which evolution can be modelled
in greater detail than ever before possible. The problems this brings
are twofold: the computational effort is significant and stands to limit the
evolution of COSMIC itself; the biggest hurdle, however, is the sheer amount
of data generated. The individual-based philosophy is clearly then a double-
edged sword: escaping global averages means always considering individuals.
This leads on to further stages of COSMIC development: tools to analyse the
output in a meaningful and concise way.

Fig. 8. Cell lineage 42 statistics




Fig. 9. Cell lineage 177 statistics



Stochastic Computations in Neurons and Neural Networks

Jianfeng Feng

COGS, Sussex University, Brighton BN1 9QH, UK and Newton Institute,
Cambridge University, CB3 0EH, UK.
jianfeng@cogs.susx.ac.uk

Abstract

We review some of our recent results on establishing a neuronal decision
theory and spiking ICA (independent component analysis). For neuronal
decision theory, we show that the discrimination capacity of a model neuron is
a decreasing function of inhibitory inputs. Increasing the output variability
of neuron efferent firings implies an improvement of neuron discrimination
capacity. For the two most interesting cases, with or without inhibitory
inputs, the critical discrimination capacity is exactly given. For spiking ICA,
by a simple combination of the informax principle and the input-output
relationship of a spiking neuron, we first develop a learning rule. By applying
the learning rule to linear mixtures of signals, we demonstrate that a spiking
neuron network can accomplish ICA tasks.

1 Introduction
Neurons fire randomly, at least in the cortex. As a consequence, it is a long
standing question in neuroscience: what are the advantages of random firing
in neuronal computation? Or what are the functional roles of random firing?
There is a corresponding question in computer science: can we make use
of the randomness, mimicking the random firings in neurons, to solve some
practical problems? There are many publications in the literature to explore
these issues [1]. Here we review some of our recent results and, hopefully,
they will shed some new lights onto these fundamental problems.
It has recently been argued that statistics might be the foundation of neu-
roscience [2]. In statistics, at least in cert ain periods, it was widely accepted
that the statistical decision theory is the basis of the whole of statistics.
Hence applying the decision theory in statistics to, or developing a decis ion
theory for, neuroscience is of vital importance. The actual neuron mechanisms
underpinning the discrimination activity remain one of the most significant
and puzzling problems in neuroscience, despite there having been mounting
experimental and theoretical results devoted to the topic (for example see
recent reviews [3,4]). In a series of experiments, Newsome and his colleagues
have compared single neuron activity with psychophysical experimental data.

They found, surprisingly, that the information extracted from single neuron
activity in the middle temporal area (MT) is almost enough to account for
psychophysical experimental data. Hence an observation of the firing rates of a
single neuron, at least in MT, contains enough information to further guide
motor activity. Given the enormous number of neurons in the cortex, their
findings are striking and open up many interesting issues for further theoretical
and experimental study. Interestingly, similar findings are reported in
somatosensory pathways [4] as well. In line with these experimental results,
we concentrate on the relationship of the input and output firing rates of
a single neuron. The first issue we are going to address is quite straightforward
(a discrimination task, which is the basis of decision theory, see Fig.
1). Suppose that a neuron receives two sets of signals (coded by firing rates)

Fig. 1. For two mixed signals (left), after neuronal transformation, will they become
more mixed or more separated?

distributed according to two histograms as depicted in Fig. 1 (left). Will the
signals become more mixed or more separated, after neuronal transformations?

More specifically, we consider neuron models with a combination of (coherent)
signal inputs and masking 'noise' inputs. The models we employ here
are the integrate-and-fire (IF) model and the IF-FHN model [5]. We find
that with a small fraction of signal inputs, the efferent spike trains of the
model contain enough information to discriminate between different inputs
(see below for more details).

We then explore the possible functional role of inhibitory inputs in
discrimination tasks. A neuron receives extensive excitatory and
inhibitory inputs. It is clear that the excitatory input codes the input
information: the stronger the stimuli are, the faster the neuron fires. Less is known
about the inhibitory input, although different theoretical hypotheses have
been put forward in the literature, ranging from synchronizing the
firing of a group of neurons [6] and linearizing the input-output relationship [7] to
increasing neuron firing rates [5]. We find that adding a certain amount of
inhibitory inputs considerably enhances the neuronal discrimination capability
if signal inputs are correlated.
The conclusion above seems quite counter-intuitive. We all know that
increasing inhibitory inputs to a single neuron model will result in an increase
in the variability of its efferent spike trains [5]. The histogram of firing rates
will thus become more spread out and, as a consequence, the discrimination
of different inputs becomes more difficult. However, this is not the case. To
understand the mechanism underpinning the observed phenomena, we then
go a step further to theoretically explore the model behaviour. Based upon
the IF model, a theory on discrimination tasks is developed. We find that the
two key mechanisms for achieving a better separation of output signals, in
comparison with input signals, are:
1. Input signals are positively correlated.
2. Excitatory inputs and inhibitory inputs are exactly balanced.
Without correlations, no matter how strong the inhibitory inputs are, the
separability of the output signals and the input signals is identical: if the
input signals are separable, so are the output signals, and vice versa. With
correlations, the stronger the inhibitory inputs are, the better the separation.

Theoretically, the critical value of the coherent inputs at which the output
histograms are separable is exactly obtained (Theorem 2) for the case
of correlated and exactly balanced inputs (the most interesting case). The
results enable us to assess the dependence of our conclusions on different
model parameters and input signals. It is illuminating to see that the critical
value is independent of model parameters including the threshold, the decay
time and the EPSP (excitatory postsynaptic potential) and IPSP (inhibitory
postsynaptic potential) magnitudes.

All the aforementioned results are obtained for the IF and IF-FHN models
without reversal potentials; we further examine our conclusions for the IF
model with reversal potentials. Since adding reversal potentials to a model
is equivalent to increasing its decay rate (depending on input signals), we
would naturally expect that the model with reversal potentials will be able
to distinguish different inputs more effectively. The conclusion is numerically
confirmed.
During the past few years, inhibitory inputs (see for example [8,9]) and
correlated inputs (see for example [10-12]) are two topics widely investigated

in neuroscience. It seems generally accepted that they play important
roles in information processing in the brain. Our results here provide
convincing and direct evidence to show that they do improve the performance of
a single neuron. Such results would also be valuable in practical applications
of spiking neural networks [13].

The second issue is to solve some engineering problems using a random
neuron network. During the past decades, we have seen many illuminating
publications on modelling single neurons, both at abstract and at biophysical
levels [14]. Many intriguing phenomena have been revealed, such as how to
ensure a single IF model generates spike trains with a coefficient of variation
between 0.5 and 1 [15-17], how to carefully tune the noise level to exhibit
stochastic resonance, and how to synchronize a network of spiking neurones with
inhibitory inputs [18]. Nevertheless, the majority of them are devoted to a
'dead' neuron model: the model is not able to learn when it receives inputs.

We study neuron models with a learning rule, i.e. the neuron is capable
of updating its weights of inputs. The simplest and most general principle
of a learning rule is probably the informax principle [19]. It has been
demonstrated that the informax principle is theoretically promising [2,19],
biologically plausible [20] and widely applicable in engineering [21-23].

By a simple combination of the informax principle and the IF model, we
want to address the following problems.

- What are the implications of the informax principle with a network of
IF neurones? In other words, what is the outcome of informax principle
learning? After learning, do the weights tend to self-organize themselves,
represent inputs or accomplish something else?
- What is the computational capacity of an IF neuron? Can the IF neuron
be applied to solving practical problems? We have seen applications of the
IF model, or more general spiking neurons, to solving engineering problems,
where the information contained in spike intervals is exploited. We test the
computational capacity of the IF neuron in blind separations, using a rate
coding assumption.
- What is the optimal value of the ratio between inhibitory and excitatory
inputs? We all know there are inhibitory inputs in the neural system, but
their functional role remains elusive.

Under the informax principle, the learning rule developed for the IF model
is complicated and so a complete theoretical treatment is impossible. For
some ideal cases, we could understand its underpinning mechanisms [20]. For
general cases, we have to resort to numerical simulations, and all the questions
raised before are answered.

- The implication of the informax principle is to dilute connections between
neurones, i.e. a more sparse representation of inputs is achieved. With fully,

randomly initialized connections, some weights will automatically die out
after learning.
- We test the computational capacity of the IF model in blind separations.
The incoming signal for each single neuron is a Poisson process, implying a
signal with a very low signal-to-noise ratio. Nevertheless, very crude simulations
show that the computational capacity of the IF model is promising. It can
successfully separate incoming, mixed signals.
- For neurons with both constant inputs and time-varying inputs, we find
that the IF model behaves more reasonably when there is a certain amount
of inhibitory inputs.

Taken together, we first assess the advantages of random firing in neurons
in discrimination tasks. A neuronal decision theory is developed. We then apply
a random, spiking neuron network to accomplish independent component analysis
(spiking ICA). Due to the space limit, we have not included all results here
and refer the reader to related papers for more details [7,20,24].

2 The Integrate-and-fire Model and its Inputs

The first neuron model we use here is the classical IF model [15,16,25].
When the membrane potential V_t is below the threshold V_thre, it is given by

    dV_t = −L(V_t − V_rest) dt + dI_syn(t)    (1)

where L is the decay coefficient and the synaptic input is

    I_syn(t) = a Σ_{i=1}^{p} E_i(t) − b Σ_{j=1}^{q} I_j(t)

with E_i(t), I_j(t) Poisson processes with rates λ_{i,E} and λ_{j,I} respectively,
a > 0, b > 0 the magnitudes of each EPSP and IPSP, and p and q the total
numbers of active excitatory and inhibitory synapses. Once V_t crosses V_thre
from below, a spike is generated and V_t is reset to V_rest, the resting potential.
This model is termed the IF model. The interspike interval of efferent spikes
is

    T = inf{t : V_t ≥ V_thre}

More specifically, synaptic inputs take the following form (p = q):

    I_syn(t) = a Σ_{i=1}^{p} E_i(t) − b Σ_{j=1}^{p} I_j(t)
             = a Σ_{i=1}^{p_c} E_i(t) + a Σ_{i=p_c+1}^{p} E_i(t) − b Σ_{i=1}^{p_c} I_i(t) − b Σ_{i=p_c+1}^{p} I_i(t)

where E_i(t), i = 1,···,p_c, are correlated Poisson processes with an identical
rate λ_j, j = 1,2; E_i(t), i = p_c+1,···,p, are Poisson processes with firing rates
ξ_{i−p_c}, independently and identically distributed random variables from
[0,100]; and I_i(t), i = 1,···,p, have the same property as E_i(t), but with
a firing rate of rλ_j, j = 1,2, or rξ_{i−p_c}, for r ∈ [0,1] representing the ratio
between inhibitory and excitatory inputs.
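The model of Eq. (1) with these inputs can be simulated directly. The sketch below uses a simple Euler scheme, approximating each synapse's Poisson arrivals by one Bernoulli trial per time step; all parameter values are illustrative choices, not those used later in this chapter:

```python
import random

# Euler-scheme sketch of the IF model of Eq. (1) with Poisson synaptic input.
# All parameter values are illustrative, not those used in the chapter.

def simulate_if(rate=50.0, p=20, q=20, r=0.5, a=1.0, b=1.0,
                L=50.0, v_thre=12.0, v_rest=0.0,
                dt=1e-3, t_max=2.0, seed=1):
    rng = random.Random(seed)
    v, spikes = v_rest, 0
    for _ in range(int(t_max / dt)):
        # Poisson arrivals approximated by one Bernoulli trial per synapse per step
        n_exc = sum(rng.random() < rate * dt for _ in range(p))
        n_inh = sum(rng.random() < r * rate * dt for _ in range(q))
        v += -L * (v - v_rest) * dt + a * n_exc - b * n_inh
        if v >= v_thre:          # threshold crossing: spike and reset
            spikes += 1
            v = v_rest
    return spikes / t_max        # efferent firing rate in Hz

print(simulate_if())
```

With exactly balanced inhibition (r = 1) the net drift vanishes and firing is driven purely by fluctuations; lowering r adds a positive drift and raises the output rate.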
From now on, we further use diffusion approximations to approximate
synaptic inputs [25] and, without loss of generality, we assume that a = b and
V_rest = 0:

    I_syn(t) = (a p_c λ_j + a Σ_{i=1}^{p−p_c} ξ_i − b p_c r λ_j − b Σ_{i=1}^{p−p_c} r ξ_i) t + σ B_t

(σ being the corresponding diffusion coefficient)

where B_t is the standard Brownian motion and j = 1,2. We first consider
the case that a neuron receives independent inputs. As we might expect, the
output from a single neuron does not contain enough information for the
discrimination task (results not shown, see next section), with the ratio of
inhibitory to excitatory inputs spanning from nil to one (exactly balanced
inhibitory and excitatory input). We then turn to the situation where a small
amount of correlation is added to the synaptic inputs which code coherently
moving dots. For simplicity of notation, we assume that the correlation
coefficient between the ith excitatory (inhibitory) synapse and the jth excitatory
(inhibitory) synapse is c > 0. The correlation considered here reflects the
correlation of activity of different synapses, as discussed and explored in [5].
It is not the correlation of single incoming EPSPs or IPSPs, which could be
expressed as c_{ij}(t − t_i) for the EPSP (IPSP) at time t of the ith synapse and
time t_i of the jth synapse. We refer the reader to [5] for a detailed discussion
on the meaning of the correlation considered here.
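Pairwise-correlated Poisson inputs of this kind can be realized by thinning a shared "mother" train: each synapse keeps every mother spike with probability c, so each train is Poisson with rate λ and any two trains have count correlation c. This is one standard construction and is not necessarily the one used in [5]:

```python
import math
import random

# Correlated Poisson spike counts by thinning a shared mother train of rate
# lam/c: each synapse keeps a mother spike with probability c, giving
# marginal rate lam and pairwise count correlation c. A standard
# construction; not necessarily that of [5].

def poisson(lam, rng):
    # Knuth's multiplication method (fine for the small means used here)
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1

def correlated_counts(lam, c, n_synapses, t, rng):
    mother = poisson(lam * t / c, rng)
    return [sum(rng.random() < c for _ in range(mother)) for _ in range(n_synapses)]

rng = random.Random(0)
print(correlated_counts(20.0, 0.5, 5, 0.5, rng))  # five correlated counts, mean ~10
```

In the limit c = 1 every synapse copies the mother train exactly, so all counts coincide; as c shrinks, the shared component (and hence the correlation) shrinks with it.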
In summary, suppose that a neuron receives p synaptic inputs. The goal
of the postsynaptic neuron is to discriminate between two types of inputs:
1. p_c excitatory Poisson inputs fire at a rate λ_1 and p_c inhibitory Poisson
inputs fire at a rate rλ_1 with r ∈ [0,1].
2. p_c excitatory Poisson inputs fire at a rate λ_2 (λ_2 ≠ λ_1) and p_c inhibitory
Poisson inputs fire at a rate rλ_2 with r ∈ [0,1].
In both cases, the neuron receives 'noise' Poisson inputs consisting of p − p_c
excitatory inputs and the same number of inhibitory inputs. We assume that
'noise' excitatory inputs are uniformly distributed between 0 and 100 Hz,
and 'noise' inhibitory inputs between 0 and 100r Hz. Without loss of
generality, we always assume that λ_2 > λ_1.

Fig. 2. A schematic plot of two output histograms, R_min(λ_2) and R_max(λ_1)

3 Theoretical Results

In this section we concentrate on theoretical results. Let Λ be the set of
input frequencies of the model, which is [0,100]. It will become obvious that all
theoretical results are independent of this choice. For a fixed (λ_1 ∈ Λ, λ_2 ∈ Λ)
with λ_1 < λ_2 we have two corresponding histograms p_1(λ) and p_2(λ) of output
firing rates, as shown in Fig. 2. Let

    R_max(λ_1) = max{λ : p_1(λ) > 0}

and

    R_min(λ_2) = min{λ : p_2(λ) > 0}

and denote

    α(λ_1, λ_2, c, r) = min{p_c : R_min(λ_2) ≥ R_max(λ_1)}    (2)

If the dependence of α(λ_1, λ_2, c, r) on c, r is clear from the context, we
sometimes simply write α(λ_1, λ_2, c, r) as α(λ_1, λ_2). Hence for fixed (λ_1, λ_2),
α(λ_1, λ_2) gives us the critical value of p_c: when p_c > α(λ_1, λ_2) the input
patterns are perfectly separable in the sense that the output firing rate
histograms are not mixed, with TPM = 0; when p_c < α(λ_1, λ_2) the input
patterns might not be separable, with TPM > 0. Note that we consider the worst
case here and, in practical applications, the critical value of p_c at which the
input patterns are perfectly separable, as found in the previous section, is in

generally lower than α(λ_1, λ_2, c, r). From now on, all figures are generated using
the same parameters as in the previous section, if not specified otherwise.

Here is the basic idea of our approach. As pointed out before, it is not easy
to directly calculate the distribution of ⟨T⟩. Nevertheless, the discrimination
task only involves the left-most point of p_2(λ), i.e. R_min(λ_2), and the
right-most point of p_1(λ), i.e. R_max(λ_1), provided that both p_2 and p_1 are
positive only in a finite region. This is exactly the case for the model
considered here, since neurons fire within a finite region.
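For two finite samples of output rates, this criterion reduces to comparing a maximum with a minimum; the sketch below also computes a simple misclassification fraction as a stand-in for the TPM (the sample values are invented):

```python
# Separability test on two samples of output firing rates: the patterns are
# perfectly separable exactly when R_max(lambda_1) < R_min(lambda_2). The
# misclassification fraction below is a simplified stand-in for the TPM.

def separable(rates_1, rates_2):
    return max(rates_1) < min(rates_2)

def misclassified_fraction(rates_1, rates_2):
    theta = (max(rates_1) + min(rates_2)) / 2     # midpoint decision threshold
    errors = sum(x >= theta for x in rates_1) + sum(x < theta for x in rates_2)
    return errors / (len(rates_1) + len(rates_2))

r1 = [12.0, 15.5, 18.2, 21.0]   # invented output rates under lambda_1
r2 = [24.5, 27.0, 30.1, 33.8]   # invented output rates under lambda_2
print(separable(r1, r2), misclassified_fraction(r1, r2))  # True 0.0
```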

3.1 Behaviour of α(λ_1, λ_2, c, r)

First of all, we want to explore the behaviour of R_min(λ_2) − R_max(λ_1). In Fig.
3, Diff = R_min(λ_2) − R_max(λ_1) with different values of a and λ_1 = 25 Hz, λ_2 =
75 Hz is shown. In all cases we see that it is an increasing function of r and
α(λ_1, λ_2, c, r) is a decreasing function of r.

Theorem 1 Let λ_max = max{λ : λ ∈ Λ} = 100 Hz, we have

    (3)

As we have mentioned before, to find the distribution or the variance of
⟨T⟩ is a formidable task. Here, based upon the basic observations that

- the output firing rate is an increasing function of the inputs,
- the input firing rate is confined within a finite region, which is of course the
case in neuroscience,

we simplify our task from finding the variance of ⟨T⟩ to solving an algebraic
equation defined in Theorem 1. Theorem 1 is the starting point of all the following
results.

Theorem 2 When c = 0 we have

α(λ1, λ2, 0, r) = pλmax / (λ2 − λ1 + λmax)

independent of r. When c > 0 we have
α(λ1, λ2, c, r2) < α(λ1, λ2, c, r1)   (4)
Stochastic Computations in Neurons and Neural Networks 193
Fig. 3. Diff = Rmin(λ2) − Rmax(λ1) for a = 0.5 (upper panel), a = 1 (middle panel)
and a = 2 (bottom panel), with λ1 = 25 Hz, λ2 = 75 Hz and c = 0.1. It is easy to
read off α(λ1, λ2, c, r)

where 1 ≥ r2 > r1 > 0, and furthermore

α(λ1, λ2, c, 1) =
( √{[(λ2 − λ1)(1 − c) + λmax]² + 4pλmax c(λ2 − λ1)} − (λ2 − λ1)(1 − c) − λmax ) / ( 2c(λ2 − λ1) )

(5)

Before proving these conclusions, we first discuss the meaning of Theorem
2. The first conclusion tells us that with c = 0, no matter how strong the in-
hibitory inputs are, the critical value of pc is independent of r. In other words,
without correlated inputs, increasing inhibitory inputs does not enhance the
discrimination capacity of the neuron. In Theorem 3 below, we will further
prove that without correlated inputs, if the inputs are separable, so are the
outputs, and vice versa. The second conclusion says that the discrimination
capacity of the neuron is improved if the neuron receives correlated inputs.
With correlated inputs, an increase in inhibitory inputs does enhance the
discrimination capacity of the neuron. In particular, we see that for a fixed
c > 0, the optimal discrimination capacity is attained when r = 1. Hence
Theorem 2 confirms our numerical results on the IF model presented in the
previous section.
To prove Theorem 2, at first glance we might want to show directly that
α(λ1, λ2, r, c) is a decreasing function of r. Again a direct, brute-force calcu-
lation is very hard, if not impossible. In the following we employ a more
geometrically oriented proof.

In Fig. 4 some numerical results for α(λ1, λ2) are shown. It is easily seen
that when c = 0, α(λ1, λ2) is independent of r.
We want to point out another remarkable fact from Theorem 2: α(λ1, λ2, 0, r)
and α(λ1, λ2, c, 1) with c > 0 are both independent of a, Vthre, L. When r =
1, c = 0.1, λ1 = 25 Hz, λ2 = 75 Hz, λmax = 100 Hz and p = 100, we have
α(25, 75, 0.1, 1) = 32.5133 and α(25, 75, 0, 1) = 66.6667 (see Figs. 3 and 4). Hence
we conclude 32.5133 < α(25, 75, 0.1, r) < 66.6667 for r ∈ (0, 1).
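The two critical values just quoted can be reproduced from the closed-form expressions of Theorem 2. A quick numerical check, assuming those formulae with p = 100 and λmax = 100 Hz:

```python
def alpha(l1, l2, c, p=100, lmax=100.0):
    """Critical value of p_c: the c = 0 formula holds for any r;
    the c > 0 formula is the r = 1 case of Theorem 2."""
    if c == 0:
        return p * lmax / (l2 - l1 + lmax)
    d = l2 - l1
    disc = ((d * (1 - c) + lmax) ** 2 + 4 * p * lmax * c * d) ** 0.5
    return (disc - d * (1 - c) - lmax) / (2 * c * d)

print(round(alpha(25, 75, 0.1), 4))  # 32.5133
print(round(alpha(25, 75, 0.0), 4))  # 66.6667
```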
Finally we are in a position to answer one of the questions raised
in the introduction: a large coefficient of variation (CV) implies a small
α(λ1, λ2, c, r). Note that the CV of interspike intervals reflects the variance
arising when we calculate the mean interspike interval. In other words, for each
fixed realization of ξi, i = 1, ..., p − pc, it is the variation of T. When we
calculate α(λ1, λ2, c, r), the variance of the firing rate histograms is mainly in-
troduced via the masking 'noise'. In other words, it is the variation of ⟨T⟩.
Therefore these are different sources of noise. By increasing the number of
interspike intervals, we can reduce the variance of the first kind. Note that
in the previous section we deliberately employed a small number of spikes
(100), which might be close to the biological reality, to estimate ⟨T⟩. The

second kind of variance is due to the fluctuation of the input signals, or masking
noise. In conclusion, increasing inhibitory inputs introduces more variation
when we calculate ⟨T⟩, but improves the neuronal discrimination capacity.

3.2 Input-Output Relationship

In the previous subsections, we considered only the output firing rate his-
tograms. It is certainly interesting to compare the input histograms with the
output histograms. As before, let Λ be the set of input frequencies of the
model. For a fixed (λ1 ∈ Λ, λ2 ∈ Λ) with λ1 < λ2 we have two corresponding
histograms pI1(λ) and pI2(λ) of input firing rates, i.e. pI1(λ) is the histogram
of pcλ1 + Σi ξi and pI2(λ) is the histogram of pcλ2 + Σi ξi, with i = 1, ..., p − pc. Define

RImin(λ2) = min{λ : pI2(λ) > 0}

and

RImax(λ1) = max{λ : pI1(λ) > 0}

Then the relationship between RImin(λ2) − RImax(λ1) and Rmin(λ2) − Rmax(λ1)
characterizes the input-output relationship of neuronal signal transformations.
We first want to assess whether Rmin(λ2) − Rmax(λ1) > 0 even
when RImin(λ2) − RImax(λ1) < 0, i.e. the input signal is mixed, but the
output signal is separated. In Fig. 5, we plot Rmin(λ2) − Rmax(λ1) versus
RImin(λ2) − RImax(λ1) = λ2pc − λ1pc − λmax(p − pc), which is a function
of pc. It is easily seen that after the neuronal transformation, mixed signals
are better separated when c > 0. For example, when c = 0.1 and r = 1,
RImin(λ2) − RImax(λ1) = −5000 Hz (mixed), but Rmin(λ2) − Rmax(λ1) > 0
(separated). The conclusion is not true for c = 0, but the separation is not
worse after the neuronal transformation.
Theorem 3 If c > 0 we have

when

Theorem 3 reveals one of the interesting properties of the neuronal transfor-
mation. Under the assumption that input signals are correlated, the output
signals will be separated even when the input signals are mixed. As men-
tioned earlier, we believe that the fundamental requirement for a nervous
system is to tell one signal from the other. Theorem 3 tells us that, after
the transformation of the IF neuron, the input signals could be more easily
separable.
We have carried out numerical simulations to confirm our results for the
IF model without reversal potentials, with reversal potentials, and for the IF-FHN
model [5], and refer the reader to our full paper [7].
In the next section, we turn our attention to the second issue raised in
the Introduction.

Fig. 4. α(λ1, λ2) (Hz) versus ratio r with λ1 = 25 Hz, λ2 = 75 Hz and with λ1 = 25
Hz, λ2 = 30 Hz. It is easily seen that when c = 0, α(λ1, λ2) is flat; otherwise it
is a decreasing function of r

Fig. 5. Rmin(λ2) − Rmax(λ1) vs. RImin(λ2) − RImax(λ1), which is a function of pc,
for c = 0 (right) and c = 0.1 (left)

4 Informax Principle

We very briefly review some results on maximizing the mutual information be-
tween the input and output of a system, and refer the reader to [22] for details.
Suppose that the output (firing frequency or interspike interval) y of a neu-
ron is a function of the input rate x, with synaptic weights w; then the learning
rule under the Informax principle is to maximize −log f(y), where f(y) is the
distribution density of y. Equivalently we also have (Eq. (6) in [22])

Δw ∝ (∂y/∂x)^{−1} ∂/∂w (∂y/∂x)   (6)

In fact, as pointed out in [22], this principle is equivalent to entropy
maximization. For the high-dimensional case we simply replace ∂y/∂x by
the determinant of the Jacobian matrix, that is |J| = |(∂yi/∂xj)ij|.
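For a single sigmoidal unit y = 1/(1 + exp(−(wx + b))), carrying out the derivatives in Eq. (6) yields the rule 1/w + x(1 − 2y) derived in [22]. A minimal sketch comparing the analytic expression with a finite-difference evaluation of (∂y/∂x)^{−1} ∂(∂y/∂x)/∂w (the numerical values are illustrative):

```python
import math

def dydx(x, w, b):
    """Slope of the sigmoid with respect to its input: w*y*(1 - y)."""
    y = 1.0 / (1.0 + math.exp(-(w * x + b)))
    return w * y * (1.0 - y)

def infomax_dw(x, w, b):
    """Analytic form of (dy/dx)**(-1) * d(dy/dx)/dw = 1/w + x*(1 - 2y)."""
    y = 1.0 / (1.0 + math.exp(-(w * x + b)))
    return 1.0 / w + x * (1.0 - 2.0 * y)

x, w, b, h = 0.7, 1.3, -0.2, 1e-6
numeric = (dydx(x, w + h, b) - dydx(x, w - h, b)) / (2 * h) / dydx(x, w, b)
print(abs(numeric - infomax_dw(x, w, b)) < 1e-5)  # True
```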

4.1 The IF Model Redefined

For the purpose of the following sections, we redefine the IF model. Sup-
pose that a neuron receives EPSPs (excitatory postsynaptic potentials) at
p excitatory synapses and IPSPs (inhibitory postsynaptic potentials) at p inhibitory
synapses. When the membrane potential V(i)t of the ith neuron is between the
resting potential Vrest and the threshold Vthre, it obeys

dV(i)t = −L(V(i)t − Vrest)dt + dI(i)syn(t)   (7)

where L is the decay rate and the synaptic input is

I(i)syn(t) = Σ_{j=1}^{p} w^E_ij Ej(t) − Σ_{j=1}^{p} w^I_ij Ij(t)

with Ej(t), Ij(t) Poisson processes with rates λ^E_j and λ^I_j respectively, and w^E_ij >
0, w^I_ij > 0 the magnitudes of each EPSP and IPSP. As before, the IF model
can be approximated by

dV(i)t = −L(V(i)t − Vrest)dt + dĨ(i)syn(t)

where

Ĩ(i)syn(t) = Σ_{j=1}^{p} w^E_ij λ^E_j t − Σ_{j=1}^{p} w^I_ij λ^I_j t
+ Σ_{j=1}^{p} √((w^E_ij)² λ^E_j) B^E_j(t) − Σ_{j=1}^{p} √((w^I_ij)² λ^I_j) B^I_j(t)   (8)
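The drift and diffusion terms in Eq. (8) follow from matching the first two moments of the Poisson input: wN(t) has mean wλt and variance w²λt. A minimal empirical check (the parameter values and the function name are illustrative):

```python
import random

def poisson_input_moments(w=0.5, lam=2.0, t=1.0, trials=20000, seed=0):
    """Empirical mean and variance of w*N(t) for a Poisson process N with
    rate lam, to compare with the drift w*lam*t and variance w**2*lam*t
    used in the diffusion approximation."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        # draw N(t) by accumulating exponential inter-arrival times
        n, acc = 0, rng.expovariate(lam)
        while acc < t:
            n += 1
            acc += rng.expovariate(lam)
        samples.append(w * n)
    m = sum(samples) / trials
    v = sum((s - m) ** 2 for s in samples) / trials
    return m, v

m, v = poisson_input_moments()
print(abs(m - 1.0) < 0.05, abs(v - 0.5) < 0.05)  # True True
```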

Since a sum of Brownian motions is again a Brownian motion, we
can rewrite the equation above as follows

(9)

where Bi(t) is a standard Brownian motion

(10)

In the sequel, for simplicity of notation, as in the previous sections we
assume that wij = w^E_ij = w^I_ij and λ^I_j = rλ^E_j = rλ for r ∈ [0, 1]. Therefore
when r = 0 the cell receives purely excitatory input, and when r = 1 its
inputs are exactly balanced. The interspike interval of efferent spikes is

Ti(r) = inf{t : V(i)t ≥ Vthre}

We consider only the case of rate coding, since a rigorous input-
output relationship of firing rates is then known for the IF model. By rate coding,
we mean that the information is carried by the firing rate of a neuron. It is
well known in the literature that the input-output relationship of a neuron
takes a sigmoidal form, and this is the basis of the neural computations developed
over the past decades. The input-output relationship of an IF model (see Fig.
7) takes a sigmoidal form as well (not surprising at all), but it depends
not only on the mean of the inputs, but also on their variance. The
latter feature enables us to derive novel learning rules which, to the best of
our knowledge, have not been reported in the literature and which exhibit some
intriguing phenomena. That a neuron might use higher-order
statistics to compute was recognized early in the literature (see for
example [2]).
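The dependence of the efferent rate on both the mean and the variance of the input can be seen in a direct Euler-Maruyama simulation of the diffusion approximation. In the sketch below, Vthre = 20, Vrest = 0 and L = 1/20 follow Fig. 7; the magnitude w, the rate range, the time step and the seed are illustrative assumptions:

```python
import math, random

def sim_if(lam, r, w=0.5, L=1 / 20, vthre=20.0, vrest=0.0,
           dt=0.1, t_max=2000.0, seed=0):
    """Euler-Maruyama simulation of the diffusion approximation:
    dV = -L(V - Vrest)dt + w*lam*(1 - r)dt + w*sqrt(lam*(1 + r))dB,
    with lam the total input rate per ms; returns the spike count."""
    rng = random.Random(seed)
    v, spikes = vrest, 0
    mu = w * lam * (1 - r)                # net drift from the inputs
    sigma = w * math.sqrt(lam * (1 + r))  # diffusion coefficient
    for _ in range(int(t_max / dt)):
        v += (-L * (v - vrest) + mu) * dt \
             + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if v >= vthre:                    # threshold crossing: spike, reset
            spikes += 1
            v = vrest
    return spikes

# The output rate grows with the input rate ...
print(sim_if(5.0, 0.0) > sim_if(3.0, 0.0) > sim_if(1.0, 0.0))  # True
# ... and balanced inhibition (r near 1) removes most of the drift,
# leaving only noise-driven, much rarer firing
print(sim_if(5.0, 0.0) > sim_if(5.0, 0.9))                     # True
```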

5 Learning Rule

Remember that the mean interspike interval of the IF model with Poisson
inputs is given by

⟨Ti(r)⟩ = (2/L) ∫_{(VrestL − ei)/Ei}^{(VthreL − ei)/Ei} g(x) dx   (11)

where g(x) = exp(x²) ∫_{−∞}^{x} exp(−u²) du, and ei, Ei are the drift and
scaled diffusion coefficients defined below.
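Eq. (11) can be evaluated numerically. The sketch below assumes the reconstructed ingredients g(x) = exp(x²)∫_{−∞}^{x} exp(−u²)du (computed via erfc for numerical stability) and integration limits (VrestL − e)/E and (VthreL − e)/E, with e the input drift and E the scaled diffusion coefficient; the parameter values are illustrative. For strongly suprathreshold drift the result should be close to the noise-free charging time (1/L)ln(V∞/(V∞ − Vthre)):

```python
import math

def g(x):
    # g(x) = exp(x^2) * integral_{-inf}^{x} exp(-u^2) du
    #      = (sqrt(pi)/2) * exp(x^2) * erfc(-x), evaluated via erfc
    return 0.5 * math.sqrt(math.pi) * math.exp(x * x) * math.erfc(-x)

def mean_isi(mu, sigma, L=0.05, vthre=20.0, vrest=0.0, n=4000):
    """<T> = (2/L) * integral of g(x) dx from (Vrest*L - mu)/(sigma*sqrt(L))
    to (Vthre*L - mu)/(sigma*sqrt(L)), by Simpson's rule (n even)."""
    x0 = (vrest * L - mu) / (sigma * math.sqrt(L))
    x1 = (vthre * L - mu) / (sigma * math.sqrt(L))
    h = (x1 - x0) / n
    s = g(x0) + g(x1)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(x0 + k * h)
    return (2.0 / L) * s * h / 3.0

# Strong drift: close to the noise-free charging time 20*ln(50/30) ~ 10.2 ms
print(abs(mean_isi(2.5, 1.0) - 20 * math.log(50 / 30)) < 1.0)  # True
# Subthreshold drift (asymptotic potential 10 < Vthre): noise-driven, slow firing
print(mean_isi(0.5, 1.0) > mean_isi(2.5, 1.0))                 # True
```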

Fig. 6. Schematic input-output relationship. Each unit (circle) represents a group
of IF neurons. The average of spikes over the group of neurons gives rise to the mean
firing rate

Fig. 7. Output (Hz) versus input (kHz, λ1 = λ2 = λ3) of the IF model with
n = 3, wij = 0.5, i, j = 1, ..., 3, λ1 = λ2 = λ3, Vthre = 20 mV, Vrest = 0 mV and
L = 1/20

The learning rule under the Informax principle is

Δwij ∝ (1/|J|) ∂|J|/∂wij   (12)

where

and
200 Jianfeng Feng

The matrix J can be rewritten as

where

Therefore

|J| = (−1)^p [ ∏_{i=1}^{p} 1/(⟨Ti(r)⟩ + Tref)² ] [ Σ_j (−1)^{i+j} (∂⟨Ti(r)⟩/∂λj) |Aij| ]

where Aij is the (p − 1) × (p − 1) matrix obtained by deleting the ith row
and jth column of the matrix A, and

which yields

(13)

Defining

Ei = √( (Σ_{j=1}^{p} wij² λj) L(1 + r) )

ei = ( Σ_{j=1}^{p} wij λj )(1 − r)

and letting Vrest = 0,

and

ξijk(x) = [2(1 − r)Ei² + 3wij²λjL(1 − r²) + 2wijL(1 + r)(xL − ei)] / (2Ei³)
− 3wijλjL(1 + r)[2wij(1 − r)Ei² + wij²L(1 + r)(xL − ei)] / (2Ei⁵)   if k = j

ξijk(x) = wikλjL(1 − r²)(4wij − wik) / (2Ei³)
− 3wijλjL(1 + r)[2wik(1 − r)Ei² + wik²L(1 + r)(xL − ei)] / (2Ei⁵)   if k ≠ j
we arrive at

∂⟨Ti(r)⟩/∂λj = −(2/L) g((VthreL − ei)/Ei) ξij(Vthre) + (2/L) g(−ei/Ei) ξij(0)   (14)

∂⟨Ti(r)⟩/∂wij = −(2/L) g((VthreL − ei)/Ei) ηij(Vthre) + (2/L) g(−ei/Ei) ηij(0)
and

∂/∂wij ( ∂⟨Ti(r)⟩/∂λk ) = −(2/L) g((VthreL − ei)/Ei) ξijk(Vthre)
− (2/L) [2g((VthreL − ei)/Ei)(VthreL − ei)/Ei + 1] ξik(Vthre) ηij(Vthre)
+ (2/L) g(−ei/Ei) ξijk(0)
+ (2/L) [2g(−ei/Ei)(−ei/Ei) + 1] ξik(0) ηij(0)
(15)
Combining Eqs. (14) and (15) with Eq. (13), we obtain a novel learning
rule based upon the IF model. The first term in Eq. (13) represents how the
weight should be updated according to its input-output relationship; the second
term relies on the derivatives of Ti(r).
To fully understand the learning rule presented here is a tough issue;
nevertheless, for a special case we can grasp a complete picture. Let us
consider the ideal case of wij = w, λi = λ and n = 1, i.e. the neuron has only
one input and one output. Now Eq. (13) reduces to

(1/|J|) ∂|J|/∂w = −2 (∂⟨T(r)⟩/∂w) / (⟨T(r)⟩ + Tref) + (∂²⟨T(r)⟩/∂λ∂w) / (∂⟨T(r)⟩/∂λ)   (16)

and we have η(0) = 0. After making some further calculations, we obtain the
learning rule developed in [20]. For this ideal case, we know that there is
a unique stable point of the learning rule, and the weight is automatically
restricted to the region (0, ∞).

For the general case the learning rule presented here is too complex to
be explored theoretically; nevertheless, we can simulate it numerically, as pre-
sented in the next section.

6 Numerical Results

6.1 Supervised Learning

Recall that in Eq. (13) we have

Δwij ∝ (1/|J|) ∂|J|/∂wij   (17)

where ȳi is the efferent firing rate (in units of 1/ms) of the ith neuron. There-
fore, if we fix (clamp) the efferent firing rates, we have a version of
supervised learning.
We simulate the learning rule with the following parameters: p = 6, ȳi =
50 Hz, λi = 500 Hz for i = 1, 2, 3, λi = 2,000 Hz for i = 4, 5, 6,
r ∈ [0, 1], Vthre = 20, Vrest = 0 and L = 1/20, using Matlab. The total
excitatory input is 7,500 Hz. The input is equivalent to each neuron re-
ceiving 100 active synaptic inputs, each at 75 Hz, which are physiologically
plausible parameters [17]. Note that for the purpose of visualizing results,
here we scale down p and scale up λ correspondingly. The initial weights are
random variables from U(0, 9) and the learning step size is 0.05. After 800 time
steps of learning, the synaptic weights are stable, for all cases we considered
here.
We carry out simulations for r = 0, 0.5 and 1. It is shown (see [24]) that
the weights become diluted after learning. For example, when r = 0 we have

(w11(0), w12(0), w13(0), w14(0), w15(0), w16(0)) = (7.8, 2.9, 2.2, 5.5, 9.1, 5.2)

and

(w11(800), w12(800), w13(800), w14(800), w15(800), w16(800)) =
(9.5, 0, 0, 0, 5.3, 1.4)

namely the connections w12, w13 and w14 die out. Hence, in general, the connections
of an IF model network, under Informax learning, become diluted.

To achieve successful learning, the ISI of the output of each neuron should
be 20 ms. Nevertheless, in the learning rule we still have one free parameter Tref.
Hence for an ISI smaller than 20 ms, we can always add an
appropriate refractory period so that the output ISI is 20 ms. In other words,
the learning rule is successful if the output ISI is less than 20 ms. Figure 8 tells
us that in all cases with r = 0 and r = 0.5, the learning rule is successful, i.e.
all means are less than 20 ms. The conclusion is not true for r = 1, where too
large an interspike interval is obtained. Figure 8 also shows the coefficient of
variation (CV) of the ISI. It is interesting to see that the coefficient of variation
of the model with r = 0 is generally smaller than 0.5. For a recent discussion
on the CV of the ISI of the IF model, we refer the reader to [17]. In summary,
only when r = 0.5 does the IF model learn successfully and generate spike
trains with a coefficient of variation inside [0.5, 1] [26].

Fig. 8. Mean and CV of output ISIs. Left are initial values and right are values
after learning

6.2 Unsupervised Learning


Now we turn our attention to unsupervised learning, which means that the
output firing rate is determined by the input rate, rather than being fixed as in the
previous subsection. We fix Tref = 10 ms in all simulations.

6.3 Signal Separations

To test the computational capacity of the IF model, we consider a
toy model of time-varying inputs, as in [27]. For a general background on
blind separation, we refer the reader to [21-23]. The input signals are
Fig. 9. Input signals (lower trace of each figure) and output signals (upper trace)
after blind separation with r = 0.5

a(t) = 2 sin(400t) cos(30t)
b(t) = 2 sign{sin[500t + 9 cos(40t)]}   (18)
c(t) = 2 (rand − 0.5)

for t ≥ 0, where rand denotes a uniform random number from [0, 1]. The input
signals are mixed with the matrix
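The three sources of Eq. (18) are straightforward to generate. A minimal sketch (the function name and random seed are illustrative):

```python
import math, random

def sources(t, rng):
    """The three source signals of Eq. (18); rand is uniform on [0, 1]."""
    a = 2 * math.sin(400 * t) * math.cos(30 * t)
    s = math.sin(500 * t + 9 * math.cos(40 * t))
    b = 2 * (1 if s > 0 else -1 if s < 0 else 0)  # 2*sign(...)
    c = 2 * (rng.random() - 0.5)
    return a, b, c

rng = random.Random(1)
a, b, c = sources(0.01, rng)
print(abs(a) <= 2, b in (-2, 0, 2), abs(c) <= 1)  # True True True
```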

M = ( 0.3985 −0.1966 0.0021
     −0.0801 0.2436 −0.0072
      0.0070 −0.0715 0.1134 )

and a constant vector (2, 2, 2) is added (to ensure that the input signals are
positive), i.e. the received signals are

(λ1(t), λ2(t), λ3(t))' = M(a(t), b(t), c(t))' + (2, 2, 2)'

and the inverse matrix of M is

M⁻¹ = ( 3.0000 2.4500 0.1000
        1.0000 5.0000 0.3000
        0.4450 3.0000 9.0000 )
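The quoted inverse can be checked by direct multiplication: with the four-decimal entries above, MM⁻¹ deviates from the identity only at the third decimal place. A pure-Python check:

```python
M = [[0.3985, -0.1966, 0.0021],
     [-0.0801, 0.2436, -0.0072],
     [0.0070, -0.0715, 0.1134]]
Minv = [[3.0000, 2.4500, 0.1000],
        [1.0000, 5.0000, 0.3000],
        [0.4450, 3.0000, 9.0000]]

# Product M @ Minv, entry by entry
P = [[sum(M[i][k] * Minv[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
err = max(abs(P[i][j] - (1.0 if i == j else 0.0))
          for i in range(3) for j in range(3))
print(err < 0.01)  # True
```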

The weights are updated with a rate of 1/(t + 100)^{1/2}, from an initial state
randomly generated from U(0, 10). After 800 iterations we obtain

w(800) = ( 0.3679 0.0052 5.6465
           0.1533 1.2619 0.1755
           6.3680 4.4491 0.0023 )

which gives the results shown in Fig. 9. The output signals are obtained
via
We also ran the same simulations with r = 0 and r = 1. It is interesting
to find that when r = 1, i.e. with exactly balanced excitatory and inhibitory
input, the input signals are difficult to recover. In all simulations we carried out,
the weights quickly diverged to large numbers. In contrast, when r is small
(r = 0, r = 0.5), the input signals are reasonably recovered.
Comparing Fig. 9 with Fig. 10, we find that a better blind separation of the
input signals is achieved when r = 0.5 (see b(t)). We thus conclude that the IF
model has a better performance when a certain amount of inhibitory input
is added to the inputs. Finally, in Fig. 11, the output firing rates of the three
neurones are shown with r = 0.
Now we test our algorithms on people's faces. Each face in Fig. 12 is a file
of 287 × 384 pixels. Using the initial w(0) as before and input signals from
[100, 200] × [100, 200], we recover the faces as shown in Fig. 12.

7 Discussion

We have considered the problem of discriminating between input signals in
terms of an observation of the efferent spike trains of a single neuron. We have
demonstrated, both theoretically and numerically, that two key mechanisms
Fig. 10. Input signals (lower trace of each figure) and output signals (upper trace)
after blind separation with r = 0

to enhance the discrimination capability of the model neuron are to increase
inhibitory inputs and correlated inputs.
We have also presented a theoretical and numerical approach for deriving
novel learning rules based upon spiking neurons. In particular, for the IF
model and for both supervised and unsupervised learning, we show that
the model under the Informax principle tends to dilute its connections, and a
certain amount of inhibitory input is required to optimize its performance.
The conclusions are tested with both constant-rate inputs and time-varying
inputs.
We have considered only the IF model under the rate-coding assumption
here. Is the learning rule developed here applicable to time coding? The
answer is affirmative. Remember the learning rule developed in the previous
sections
Fig. 11. Efferent interspike intervals versus time with r = 0; see Fig. 10

Fig. 12. The upper panel shows the original faces, each with 287 × 384 pixels. The lower panel
is obtained after 100,000 iterations, r = 0.5

Δwij ∝ (1/|J|) ∂|J|/∂wij   (19)

if we simply use ȳi and λi as instantaneous firing rates, i.e. the inverse of the
interspike interval, a learning rule based upon spike timing is obtained [28-32].
We will explore it in further publications.
The approach presented here is quite general. In principle, once we know
the input-output relationship of a neuron, we can obtain the learning
rule of the neuron. As we have mentioned in Sec. 1, we have accumulated
many results on the input-output relationship of a single neuron, for example
the IF-FHN model [33]. We expect our approach can also shed more light
onto the coding problem [29].
It is also very interesting to note that for all cases we considered, the best
separation is achieved for a(t), which is supposed to be a more natural signal
than the other two. Whether this is an intrinsic property of the IF model is
not clear at the moment. There are many issues to be explored further in the
future. For neuronal decision theory:
- We have only attempted to accomplish the discrimination task and have
not included time constraints. It is definitely of vital importance for a neu-
ronal system to tell one signal from the other within as short a time window
as possible.
- We have tested our model with static inputs. It is an interesting question to
generalize our results to time-varying inputs as reported in [4]. Such
a study might help clarify the ongoing debate on the advantages
of 'dynamical stimuli' over 'static stimuli'.
- The input signal used here is very naive. To transform the image of moving
dots into the input signals specified in the present paper requires a neural net-
work to preprocess the image. Hence devising a network model (spiking
neural networks or a Reichardt detector [34]) to reproduce our results is one
of our ongoing research topics. We expect that such a study could provide
us with a template to compare models with psychophysical experiments.
- A neuronal system without learning is a 'dead' system. In actual situa-
tions, we all know that learning is prevalent in neuronal systems. Hence a
reasonable learning rule should improve the neuronal capability of discrim-
inating between different input signals. There are several learning rules reported
in the literature. To assess their impact on discrimination tasks is
an intriguing issue.
Discriminating between different input signals is probably a more fun-
damental constraint on the neural system than others such as maximizing
input-output information or redundancy reduction, a view recently echoed
in [2]. To understand it will reveal principles employed by neuronal systems
which remain mysterious to us. The issue discussed here is currently a hot
topic in neuroscience (for example, see [35]). Our approach provides a
solid theoretical foundation for further study, and we expect that it
will also open up many interesting questions to be further investigated
in the future.

For spiking ICA, as mentioned earlier, we consider a small network of
neurons due to the constraints of both visualizing the results and computational
cost: it is very time-consuming to run a simulation with 100 neurons.
Nevertheless, we are going to try simulations of large networks in the near
future.
We want to emphasize that although it is generally accepted that neurons
in the cortex receive and send out Poisson process (or, more generally, renewal
process) spike trains [1], to the best of our knowledge the functional role of
Poisson process inputs and outputs has not been rigorously tested in the
literature. In contrast, it is widely accepted that the stochastic part of the
input signal is simply noise, and harmful. Our approach here, as a first step
towards exploring the truly computational function of the Poisson process,
reveals some primary and interesting properties.

References
1. Albright T.D., Jessell T.M., Kandel E.R., and Posner M.I. (2000), Neural science: a century of progress and the mysteries that remain. Cell 100, s1-s55.
2. Barlow H. (1986), Perception: what quantitative laws govern the acquisition of knowledge from the senses? In Functions of the Brain, ed. C. Coen, Clarendon Press: Oxford.
3. Parker A.J., and Newsome W.T. (1998), Sense and the single neuron: probing the physiology of perception. Annu. Rev. Neurosci. 21, 227-277.
4. Ricciardi L.M., and Sato S. (1990), Diffusion processes and first-passage-time problems. In Lectures in Applied Mathematics and Informatics, ed. Ricciardi L.M., Manchester: Manchester University Press.
5. Feng J. (2001), Is the integrate-and-fire model good enough? - a review. Neural Networks 14, 955-975.
6. van Vreeswijk C., Abbott L.F., and Ermentrout G.B. (1994), Jour. Computat. Neurosci. 1, 313-321.
7. Feng J., and Liu F. (2003), Discriminating between input signals via single neuron activity (submitted).
8. Hopfield J.J., and Brody C.D. (2000), What is a moment? "Cortical" sensory integration over a brief interval. PNAS 97, 13919-13924.
9. Hopfield J.J., and Brody C.D. (2001), What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration. PNAS 98, 1282-1287.
10. Feng J., and Brown D. (2000b), Impact of correlated inputs on the output of the integrate-and-fire model. Neural Computation 12, 711-732.
11. Salinas E., and Sejnowski T. (2000), Impact of correlated synaptic input on output firing rate and variability in simple neuron models. Journal of Neuroscience 20, 6196-6209.
12. Salinas E., and Sejnowski T. (2001), Correlated neuronal activity and the flow of neural information. Nature Reviews Neuroscience 2, 539-550.
13. Grossberg S., Maass W., and Markram H. (2001), Neural Networks (Special Issue), vol. 14.
14. Koch C. (1999), Biophysics of Computation, Oxford University Press, Oxford.
15. Brown D., Feng J., and Feerick S. (1999), Variability of firing of Hodgkin-Huxley and FitzHugh-Nagumo neurons with stochastic synaptic input. Phys. Rev. Lett. 82, 4731-4734.
16. Feng J. (1997), Behaviours of spike output jitter in the integrate-and-fire model. Phys. Rev. Lett. 79, 4505-4508.
17. Feng J., and Zhang P. (2001), Integrate-and-fire and Hodgkin-Huxley models with correlated inputs. Phys. Rev. E 63, 051902.
18. Hopfield J.J., and Herz A.V.M. (1995), Rapid local synchronization of action potentials: toward computation with coupled integrate-and-fire neurons. Proc. Natl. Acad. Sci. USA 92, 6655-6662.
19. Linsker R. (1989), An application of the principle of maximum information preservation to linear systems. In Advances in Neural Information Processing Systems 1, Touretzky D.S. (ed), Morgan Kaufmann.
20. Feng J., Buxton H., and Deng Y.C. (2002), Training the integrate-and-fire models with the Informax principle I. J. Phys. A 35, 2379-2394.
21. Amari S. (1999), Natural gradient learning for over- and under-complete bases in ICA. Neural Computation 11, 1875-1883.
22. Bell A.J., and Sejnowski T.J. (1995), An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7, 1129-1159.
23. Lewicki M.S., and Sejnowski T.J. (2000), Learning overcomplete representations. Neural Computation 12, 337-365.
24. Feng J., Sun Y.L., Buxton H., and Wei G. (2003), Training integrate-and-fire models with the Informax principle II. IEEE Trans. Neural Networks 14, 326-336.
25. Tuckwell H.C. (1988), Introduction to Theoretical Neurobiology, Vol. 2, Cambridge University Press, Cambridge.
26. Shadlen M.N., and Newsome W.T. (1994), Noise, neural codes and cortical organization. Curr. Opin. Neurobiol. 4, 569-579.
27. Haykin S. (1999), Neural Networks, Prentice Hall, Englewood Cliffs, N.J.
28. Bi G.Q., and Poo M.M. (1998), Activity-induced synaptic modifications in hippocampal culture: dependence on spike timing, synaptic strength and cell type. J. Neurosci. 18, 10464-10472.
29. Gerstner W., Kreiter A.K., Markram H., and Herz A.V.M. (1997), Neural codes: firing rates and beyond. Proc. Natl. Acad. Sci. USA 94, 12740-12741.
30. Kempter R., Gerstner W., and van Hemmen J.L. (1999), Hebbian learning and spiking neurons. Physical Review E 59, 4498-4514.
31. Song S., Miller K.D., and Abbott L.F. (2000), Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nature Neuroscience 3, 919-926.
32. Zhang L.I., Tao H.W., Holt C.E., Harris W.A., and Poo M.M. (1998), A critical window for cooperation and competition among developing retinotectal synapses. Nature 395, 37-44.
33. Feng J., and Brown D. (2000a), Integrate-and-fire models with nonlinear leakage. Bulletin of Mathematical Biology 62, 467-481.
34. Borst A. (2000), Models of motion detection. Nature Neuroscience 3, 1168-1168.
35. Kast B. (2001), Decisions, decisions... Nature 411, 126-128.
Spatial Patterning in Explicitly Cellular Environments: Activity-Regulated Juxtacrine Signalling

N. Monk

Centre for Bioinformatics and Computational Biology, Division of Genomic Medicine, University of Sheffield, Royal Hallamshire Hospital, Sheffield S10 2JF, UK
n.monk@sheffield.ac.uk

Abstract. Pattern formation in multicellular organisms generally occurs within
populations of cells that are in close contact. It is thus natural and important to
consider models of pattern formation that are constructed using a spatially discrete
cellular structure. Here, the particular case of pattern formation in cellular systems
that depends on contact-dependent (juxtacrine) signalling between cells is dis-
cussed. Spatial and spatio-temporal patterns can emerge in populations of cells
coupled by juxtacrine signalling when the degree of activation of the relevant cell-
surface receptors regulates both the pathway of differentiation adopted by the cell
and the ability of the cell to participate in further juxtacrine signalling. When this
latter condition applies, juxtacrine signalling couples all the cells of a population
to form a spatially extended signalling network. Due to the essential nonlinearity
of the signalling, such juxtacrine networks can exhibit dynamics that are quite dif-
ferent from those in networks of cells coupled by linear diffusion. Two simple cases
are discussed here, in which receptor activation either diminishes or enhances the
signalling ability of a cell. In the former case, signalling can act to amplify small
differences between cells via a feedback-mediated competition, leading to stable
spatially periodic patterns (a process known as lateral inhibition). In the latter
case, signalling can result in a range of different patterns, including stable spatial
gradients, propagating fronts, and periodic and quasi-periodic spatial patterns.
These quite simple examples serve to illustrate the potential richness of this im-
portant class of biological signalling, and provide guidance for the development of
more complex models.

1 Introduction

Signalling between cells (intercellular signalling) is a major mechanism of pattern


formation in multicellular tissues. Juxtacrine signalling is a class of intercellular
signalling that depends on the interaction of molecules that are anchored to the
cell membrane. A wide range of juxtacrine signalling systems play important roles
212 N. Monk

in pattern formation [1]. In many cases the nature of the signalling interaction is
such that a single signalling cell influences the behaviour only of its immediate
neighbours (for example, the Boss-Sevenless interaction in the developing fly
eye [2]). However, in some cases juxtacrine signalling directs pattern formation in
large populations of cells. This occurs when every cell can act both as a source
and as a recipient of signalling. Patterning depends crucially on the ability of each
cell to regulate the strength with which it signals to its neighbours in response to
the level of signalling it receives. This type of signalling can be described as activ-
ity-regulated juxtacrine signalling (ARJS) [3].

2 Biological Setting

Juxtacrine signalling systems are centred on two distinct types of membrane-anchored protein: a ligand, presented on the surface of a signalling cell, and a receptor that can initiate changes within a cell when it is activated by the binding of a ligand presented by a neighbouring cell. The two clearest examples of ARJS are centred on the Notch and epidermal growth factor (EGF) receptors.
The best-studied example is centred on the molecules Delta and Notch, which are large proteins anchored within the cell membrane (transmembrane proteins). Notch is a receptor that can be activated by the binding of Delta. Notch activation has two consequences: (i) it initiates signalling to the cell nucleus that regulates the level of expression of a range of genes [4]; (ii) it results in the down-regulation of the signalling activity of Delta [5-7]. The down-regulation of Delta activity in response to Notch activation establishes a regulatory loop between cells, as shown in Fig. 1.


Fig. 1. A simplified scheme illustrating the main features of Delta-Notch signalling. Delta-mediated activation of Notch leads both to down-regulation of Delta activity and to signal transduction
Spatial Patterning in Explicitly Cellular Environments 213

The best characterised patterning role of Delta-Notch signalling is in the generation of spacing patterns during development by lateral inhibition: a certain proportion of cells in an epithelial sheet embark on a default pathway of differentiation (primary fate) and signal via Delta to their neighbours to inhibit them from following the same pathway of differentiation. The identity of some primary fate cells, such as those that will give rise to the large bristles (macrochaetae) in flies, is determined by a small pre-patterned bias in the cell population. Other primary fate cells, such as the progenitors of the sensory hair cells in the vertebrate inner ear, are believed to be determined by a small stochastic initial bias [8]. In both cases, the operation of the Delta-Notch feedback loop amplifies small initial biases in the activity of the Notch pathway, resulting in a characteristic spacing pattern of evenly-spaced primary fate cells (high Delta activity, low Notch activity) surrounded by inhibited secondary fate cells (low Delta, high Notch).
As with Notch, the level of activation of the epidermal growth factor receptor (EGFR) influences the rate of proliferation of cells [9]. EGFR plays important roles in both transformed (cancerous) and untransformed epithelial cells. For example, EGFR regulates epidermal wound healing by contributing to the increased rate of proliferation in cells surrounding the wound. One of its ligands, transforming growth factor α (TGFα), is predominantly presented by cells in a membrane-bound juxtacrine form [10]. Furthermore, the rates of production of both juxtacrine TGFα and of EGFR are increasing functions of the level of EGFR activation in a cell [11, 12]. TGFα-EGFR signalling thus provides a second example of activity-regulated juxtacrine signalling.

3 Mathematical Models of Juxtacrine Signalling

Juxtacrine signalling can be studied using either spatially discrete or continuous formalisms. In the discrete formalism, which is focused on here, each cell is represented individually, with the central model variables being ligand and receptor levels for each cell. When it is sufficient to treat the internal cellular dynamics implicitly, rather than explicitly, three variables can be used to represent the state of each cell: free ligand Ai, free receptor Ri, and ligand-receptor complex Ci. The general scheme for the interaction between these variables in a line of cells is shown in Fig. 2a, where the implicit internal cellular dynamics are represented by red arrows. Mathematically, this scheme can be represented by a set of coupled ordinary differential equations. Taking the simplest case of monomeric interaction between ligand and receptor, represented in Fig. 2b, then

dAi/dt = -ka Ai⟨R⟩i + kd⟨C⟩i - μa Ai + Pa(Ci)   (1)

dRi/dt = -ka⟨A⟩i Ri + kd Ci - μr Ri + Pr(Ci)   (2)

dCi/dt = +ka⟨A⟩i Ri - kd Ci - μc Ci   (3)

where ⟨q⟩i denotes the average value of the variable q in the cells immediately neighbouring the cell labelled by i. ka and kd are the association and dissociation rates of the ligand-receptor complex, respectively, and μξ denotes the linear degradation rate of the variable ξ. Pa and Pr are functions encoding the dependence of the rate of production of A and R on C. The first two terms in each equation describe ligand-receptor binding, the third term describes linear degradation, and the fourth term (where present) represents regulated production. This formalism, the explicit binding model (EBM), has been used in the work of Owen et al. [13-15] and Wearing et al. [16].
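Eqs. (1)-(3) translate directly into code. The following Python sketch (an illustrative rendering of the equations, not code from [13-16]) evaluates the right-hand sides on a line of cells; boundary cells simply average over their single neighbour:

```python
def ebm_rhs(A, R, C, ka, kd, mu_a, mu_r, mu_c, Pa, Pr):
    """Right-hand sides of Eqs. (1)-(3) for a line of cells.

    A, R, C: per-cell levels of free ligand, free receptor and
    ligand-receptor complex; Pa, Pr: production functions of C.
    """
    def nbr_avg(q):
        # <q>_i: average of q over the immediate neighbours of cell i
        # (boundary cells have a single neighbour)
        out = []
        for i in range(len(q)):
            nbrs = [q[j] for j in (i - 1, i + 1) if 0 <= j < len(q)]
            out.append(sum(nbrs) / len(nbrs))
        return out

    An, Rn, Cn = nbr_avg(A), nbr_avg(R), nbr_avg(C)
    dA = [-ka * A[i] * Rn[i] + kd * Cn[i] - mu_a * A[i] + Pa(C[i])
          for i in range(len(A))]
    dR = [-ka * An[i] * R[i] + kd * C[i] - mu_r * R[i] + Pr(C[i])
          for i in range(len(R))]
    dC = [ka * An[i] * R[i] - kd * C[i] - mu_c * C[i]
          for i in range(len(C))]
    return dA, dR, dC
```

Coupling enters only through the neighbour averages: ligand on cell i binds receptor presented by its neighbours, while production depends on the cell's own complex level Ci. Any standard ODE integrator can then be applied to the stacked system.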

Fig. 2. a The general scheme for a two-component juxtacrine signalling system in a line of cells. Ai, Ri and Ci denote the levels of ligand, free receptor and ligand-receptor complex in cell i, respectively. Blue arrows represent ligand-receptor binding interactions (juxtacrine signalling); red arrows represent intracellular regulation. b Simple monomeric reversible binding reaction between ligand (A) and receptor (R) to generate a bound ligand-receptor complex (C)

In some cases, particularly when regulated production of free receptor can be neglected, and when ligand-receptor binding reaches a steady state on a time-scale much faster than that characterising the internal cellular dynamics, attention can be focused on the two variables A and C for each cell. This approach, the implicit binding model (IBM), used by Collier et al. [17] and Monk [18], starts with model equations of the form

dCi/dt = f(⟨A⟩i) - μc Ci   (4)

dAi/dt = g(Ci) - μa Ai   (5)

It is assumed that ligand binding activates the receptor, such that f is a monotonic increasing function.
Both forms of the model equations presented above are continuous in time and discrete in space. Owen and Sherratt have developed an alternative approach in which the model equations are continuous in both space and time. This approach will not be discussed here; an outline can be found in Monk et al. [3]. Conversely, Luthi et al. have studied models of juxtacrine signalling that are discrete in both space and time [19]. Such systems are essentially cellular automata with continuous state variables, and will also not be discussed here. A good review of the use of cellular automata in biological modelling is that of Ermentrout and Edelstein-Keshet [20].
Spatially discrete models have sometimes been used to study pattern formation
in lattices of cells coupled by cell-to-cell diffusion (for example, [21]). It is impor-
tant to note that the models discussed here are fundamentally different from such
'discrete diffusion' models, since the interaction between neighbouring cells de-
pends only on absolute levels of ligand activity, and not on the difference in
ligand activity between cells. This distinction leads to significant differences in the
types of behaviour exhibited by the two classes of models.
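The distinction can be made concrete. In the sketch below (an illustrative comparison, not code from [21]), the juxtacrine input to a cell depends on the absolute ligand levels of its neighbours, whereas the discrete-diffusion input depends only on differences, and therefore vanishes in a homogeneous population:

```python
def juxtacrine_input(A, i):
    # depends on the absolute ligand levels of the neighbours of cell i
    nbrs = [A[j] for j in (i - 1, i + 1) if 0 <= j < len(A)]
    return sum(nbrs) / len(nbrs)

def discrete_diffusion_input(A, i, D=1.0):
    # depends only on differences between the neighbours and cell i
    nbrs = [A[j] for j in (i - 1, i + 1) if 0 <= j < len(A)]
    return D * sum(a - A[i] for a in nbrs)

uniform = [0.7] * 5
# in a homogeneous population the diffusive input vanishes, while the
# juxtacrine input remains at the uniform ligand level
```

This is why a uniform population can still signal strongly in the juxtacrine models, whereas discrete-diffusion coupling is silent until differences between cells arise.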

4 Pattern Formation

The model equations presented above exhibit a range of steady state and travelling
patterns. Patterning in the level of ligand-receptor complex is of particular inter-
est, since this level can have a direct influence on cell properties, such as state of
differentiation, proliferation rate, etc. Techniques for the analysis of these patterns
can be found elsewhere [13-18], and will not be covered in any detail here.
Rather, the general classes of pattern that have been found in analytical and numerical studies will be outlined. Quite generally, the types of patterns observed depend on the form of the production terms Pa and Pr, or of the feedback function g. The following cases have been studied:
1. IBM with g monotonic decreasing [17]
2. IBM with g monotonic increasing [18]
3. EBM with Pa and Pr monotonic increasing [13-16]

4.1 Lateral Inhibition and Spacing Patterns

The case of IBM with g monotonic decreasing has been proposed as a simple model that captures the essence of Delta-Notch signalling [17]. As discussed above, the operation of Delta-Notch signalling in an initially equivalent population of cells most commonly leads to the development of a spacing pattern, in which scattered cells with low Notch activity are surrounded by cells in which Notch activity is high. Equations (4) and (5) have been studied in one- and two-dimensional arrays of cells. Two classes of steady state solutions are possible: (a) a homogeneous state; (b) a spacing pattern in which cells have either high or low levels of receptor activity, cells with low receptor activity (primary fate cells) being separated by a characteristic distance with intervening cells having high receptor activity.

Which pattern develops over time depends both on the initial conditions and on the form of the feedback function g; if the steepness of g is below a critical value, then the homogeneous state is stable to perturbations and will develop from any random initial conditions (with slight discrepancies at the boundaries in general). When the steepness of g is above this critical value, then a spacing pattern develops. The precise form of the spacing pattern depends on the initial and boundary conditions, and on the form of g. Typical steady state patterns in one- and two-dimensional arrays of cells are shown in Figs. 3 and 4.
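This behaviour can be reproduced with a few lines of code. The sketch below integrates Eqs. (4) and (5) by forward Euler with illustrative Hill-type feedback functions (f increasing, g decreasing, in the spirit of [17]; the particular forms and parameter values are assumptions, not those of the original paper). A deterministic alternating bias is used instead of random initial conditions so that the final pattern is reproducible:

```python
def simulate(f, g, A, C, mu_a=1.0, mu_c=1.0, dt=0.05, steps=4000):
    """Forward-Euler integration of Eqs. (4)-(5) on a line of cells;
    <A>_i is averaged over the one or two immediate neighbours."""
    n = len(A)
    for _ in range(steps):
        Aavg = []
        for i in range(n):
            nbrs = [A[j] for j in (i - 1, i + 1) if 0 <= j < n]
            Aavg.append(sum(nbrs) / len(nbrs))
        C, A = ([C[i] + dt * (f(Aavg[i]) - mu_c * C[i]) for i in range(n)],
                [A[i] + dt * (g(C[i]) - mu_a * A[i]) for i in range(n)])
    return A, C

f = lambda a: a * a / (0.01 + a * a)       # receptor activation: increasing
g = lambda c: 1.0 / (1.0 + 100.0 * c * c)  # ligand production: decreasing

# deterministic alternating bias towards high ligand (Delta) on even cells
A0 = [1.0 if i % 2 == 0 else 0.0 for i in range(10)]
A, C = simulate(f, g, A0, [0.0] * 10)
pattern = ['high' if c > 0.5 else 'low' for c in C]
# pattern == ['low', 'high'] * 5: a stable period-2 spacing pattern
```

Cells with high ligand activity suppress ligand production in their neighbours via the feedback g, and the alternating pattern of receptor (Notch) activity is reinforced until it is stable.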

Fig. 3. Time-course (in arbitrary units of time) of development of steady state patterns of a) Delta and b) Notch activities in a line of cells. Cells initially have random low levels of Delta and Notch activity; boundary cells receive Delta signalling only from their neighbours within the line (zero boundary conditions). Note that the system approaches the (unstable) homogeneous steady state before levels of Delta and Notch activity diverge (around time 80)

In one-dimensional arrays in which all cells initially have roughly the same levels of Delta and Notch activity, the typical pattern that results has a periodicity of 2. That is, cells with high Notch activity are flanked by two cells with low Notch activity, and vice versa. It is typical for 'defects' to arise in this pattern, where two neighbouring cells both have high levels of Notch activity. However, patterns in which two neighbouring cells both have low levels of Notch activity have never been observed. Alternative patterns can be forced by biasing the initial levels of Delta and Notch activity. For example, if the initial conditions are biased such that cells with lower Notch activity (and/or higher Delta activity) are separated by two cells with higher Notch activity (and/or lower Delta activity), then this prepattern will be reinforced, resulting in a stable period-3 pattern. An example is shown in Fig. 5a. However, if one attempts to generate the converse period-3 pattern, in

which high Notch activity cells are separated by two low Notch activity cells, the initial conditions are overcome (even if they are strongly biased), and a typical period-2 pattern with defects results (see Fig. 5b).

Fig. 4. Typical steady state pattern of activity of the Notch signalling pathway in an array
of hexagonal cells. The cells surrounding the array have zero Delta activity (zero boundary
conditions). White represents high Notch activity while black represents low Notch activity.
Note the regions of close hexagonal packing separated by channels of high Notch-activity
cells

In two-dimensional arrays of hexagonal cells the proportion of primary fate (low Notch activity) cells in periodic stable patterns can, in principle, vary between 1/3 and 1/7. However, as is the case for the one-dimensional arrays, the dominant pattern that arises from unbiased initial conditions is the shortest-range pattern with 1/3 primary fate cells, with occasional defects containing an excess of high Notch activity cells (as in Fig. 4). Again, a stable pattern with two neighbouring primary fate cells has never been observed.
The results of this type of model agree quite well with the experimentally observed behaviour of cells interacting through the Delta-Notch system. In such cell populations, stable configurations result only when no primary fate cells are in contact, as seen in simulations of the model. The question of the spacing of primary fate cells during the process of Delta-Notch signalling is more difficult to address. When primary fate cells are easy to distinguish (when they have undergone differentiation), their spacing has typically been increased due to cell proliferation in the population of surrounding secondary fate cells. However, it seems that in at least some cases Delta-Notch signalling results in a stable pattern of primary fate cells much like that illustrated in Fig. 4. In a number of other cases, Delta-Notch signalling acts more to reinforce initial biases in cells, rather than to form patterns of cell fate de novo as in the model presented here [8].


Fig. 5. Time-course of Notch activity in a line of cells with biased initial conditions. a) A forced period-3 pattern resulting from initial conditions such that cells with a bias towards high Delta activity were separated by two cells biased towards low Delta activity. Note that the left-hand boundary condition forces a small region of period-2 pattern that overcomes the bias in the initial conditions. b) The effect of initial conditions biased towards a period-3 pattern in which two adjacent cells have high levels of Delta activity. This period-3 pattern is unstable, and the 'default' period-2 pattern with occasional defects soon develops

4.2 Gradients and Travelling Fronts

IBM with g monotonic increasing can be considered as a model for the propagation of a signal through a population of cells under the influence of a spatially localised signal source [18]. Due to the absence of any feedback to balance the positive feedback inherent in this model, it does not appear to be capable of generating patterns de novo. As with all the models discussed here, there exist one or more homogeneous steady states of the model, in which all cells have equal levels of receptor activation (with slight deviations at boundaries in practice). If a boundary condition is imposed that varies significantly from the homogeneous steady state, then this boundary can have an effect on the level of receptor activation that spreads into the cell population. This effect can take the form of either a stable gradient or a propagating front [18].
In the case that only one stable homogeneous steady state exists, which is just the case that the function f(g(C)/μa) - μcC has only one zero, then a fixed deviation from this steady state results in the formation of a stable gradient of receptor activation. A typical example of such a gradient is shown in Fig. 6. In this example, the only homogeneous steady state is the zero state and a fixed non-zero boundary condition generates a gradient over a range of 30 cells or so.
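A gradient of this kind can be illustrated numerically. In the sketch below the functions and parameters are assumptions chosen only so that f(g(C)) - C has a single zero at C = 0; with these simple choices the gradient spans just a few cells rather than the ~30 of Fig. 6. A constant external ligand level is presented to the boundary cell:

```python
def steady_gradient(n=30, A_ext=5.0, dt=0.05, steps=8000):
    """Euler integration of Eqs. (4)-(5) with a constant external ligand
    level A_ext presented to cell 0; returns steady-state complex levels."""
    f = lambda a: a * a / (1.0 + a * a)  # f(g(C)) - C has C = 0 as only zero
    g = lambda c: c                      # increasing ligand feedback
    A, C = [0.0] * n, [0.0] * n
    for _ in range(steps):
        Aavg = []
        for i in range(n):
            nbrs = [A[j] for j in (i - 1, i + 1) if 0 <= j < n]
            if i == 0:
                nbrs.append(A_ext)       # the fixed boundary stimulation
            Aavg.append(sum(nbrs) / len(nbrs))
        C, A = ([C[i] + dt * (f(Aavg[i]) - C[i]) for i in range(n)],
                [A[i] + dt * (g(C[i]) - A[i]) for i in range(n)])
    return C

C = steady_gradient()
# C decreases monotonically away from the source; far cells are unaffected
```

Steeper feedback functions extend the range of the gradient, but, as discussed below, only up to the point where additional homogeneous steady states appear.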


Fig. 6. A typical steady state gradient of receptor activity resulting from unilateral constant stimulation of a line of cells

While the gradients formed in this class of model have an appearance similar to those formed by the diffusion of a substance from a localised source, they are quite different in some respects. Perhaps most significantly, the range over which a gradient can form is restricted. This is in contrast to gradients formed by diffusion, whose range can be extended both by increasing the strength of the signal source and by increasing the diffusion wavelength (given by the balance of the diffusion coefficient and the decay rate of the diffusing substance). Range restriction stems from two features of the juxtacrine model. Firstly, if the strength of the signal source is increased, the receptors on the cells neighbouring the source eventually become fully saturated; further increases in signal strength will then have no effect on the form of the gradient. Secondly, if the steepness of f and/or g is increased (roughly equivalent to increasing the diffusion wavelength), then a point is reached where additional homogeneous steady states come into existence. In this case, travelling fronts rather than gradients are generated. Thus, while these two strategies can be effective in increasing the range of gradients in the juxtacrine model, there are strict upper limits to their use, beyond which they either have no effect or lead to the generation of distinct modes of response.
In the case when there exist two or more stable homogeneous steady states, a localised perturbation to one of these states can result in the establishment of a travelling front propagating away from the site of the disturbance. Typically, two such stable steady states are separated by an unstable homogeneous steady state (see Fig. 7).

Fig. 7. A plot of the function f(g(C)/μa) - μcC showing the existence of three homogeneous steady states, C0, C1, C2. C0 and C2 are stable, C1 is unstable. In this example, μa = μc = 1, f(ξ) = ξ²/(0.2 + ξ²) and g(ξ) = ξ
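The three steady states can be located numerically. Taking the illustrative choices f(ξ) = ξ²/(0.2 + ξ²), g(ξ) = ξ and μa = μc = 1 (an assumed reading of the Fig. 7 example), the non-zero states solve C² - C + 0.2 = 0, i.e. C = (1 ± √0.2)/2, which simple bisection recovers:

```python
def phi(C):
    # f(g(C)/mu_a) - mu_c*C with f(x) = x^2/(0.2 + x^2), g(x) = x, mu = 1
    return C * C / (0.2 + C * C) - C

def bisect(lo, hi, tol=1e-12):
    """Standard bisection; phi must change sign on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(lo) * phi(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

C0 = 0.0               # phi(0) = 0: the zero state (stable)
C1 = bisect(0.1, 0.5)  # unstable intermediate state
C2 = bisect(0.5, 0.9)  # upper stable state
# C1 ≈ 0.2764 and C2 ≈ 0.7236, i.e. (1 ∓ sqrt(0.2))/2
```

The brackets [0.1, 0.5] and [0.5, 0.9] are chosen by inspecting the sign of phi, which is negative between C0 and C1 and positive between C1 and C2.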

Let the levels of receptor activation in the two stable states be C0 and C2, and that in the intervening unstable state be C1, with C2 > C1 > C0. Then if the cells in an array are at C0, a fixed local perturbation of level Cp will generate a travelling front if Cp > C1. Behind this front, all cells will be at C2. An example is shown in Fig. 8. Conversely, if the cells are initially at C2, then a perturbation Cp < C1 will generate a travelling front behind which cells are at C0. Once established, these fronts move across an array of cells at a constant speed that can be determined using linear analysis about the homogeneous states.
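Front propagation can be demonstrated with the same Euler scheme. In the sketch below the feedback functions are again assumptions, f(ξ) = ξ²/(0.1 + ξ²) and g(ξ) = ξ with μa = μc = 1, chosen so that the upper state strongly dominates and the discrete front does not pin. A perturbation of the first three cells of a line initially at the zero state generates a front that invades the whole array:

```python
def front(n=40, dt=0.05, steps=8000):
    """Euler integration of Eqs. (4)-(5) with bistable feedback; the three
    homogeneous steady states are C0 = 0, C1 = (1 - sqrt(0.6))/2 ~ 0.113
    (unstable) and C2 = (1 + sqrt(0.6))/2 ~ 0.887."""
    f = lambda a: a * a / (0.1 + a * a)
    g = lambda c: c
    C = [0.9 if i < 3 else 0.0 for i in range(n)]  # local perturbation > C1
    A = C[:]
    for _ in range(steps):
        Aavg = []
        for i in range(n):
            nbrs = [A[j] for j in (i - 1, i + 1) if 0 <= j < n]
            Aavg.append(sum(nbrs) / len(nbrs))
        C, A = ([C[i] + dt * (f(Aavg[i]) - C[i]) for i in range(n)],
                [A[i] + dt * (g(C[i]) - A[i]) for i in range(n)])
    return C

C = front()
# the front has swept the array: every cell ends close to the upper state C2
```

Printing intermediate states of C shows the interface moving rightwards at a roughly constant speed, as in Fig. 8, before the whole line settles at C2.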

4.3 More Complex Spatial Patterns

The richest model in terms of de novo pattern formation is EBM with Pa and Pr monotonically increasing [13-16]. Like IBM with g monotonically increasing, discussed in the previous section, this model can generate both spatial gradients and travelling fronts in response to a localised perturbation to a stable homogeneous steady state. However, in this case the existence of feedback on the level of free receptor allows gradients of unrestricted range to be formed (although the time taken to form the gradient increases with the range). Details are contained in Owen et al. [13, 14].

Fig. 8. A typical travelling front of receptor activity resulting from unilateral constant stimulation of a line of cells

For certain forms of the feedback functions Pa and Pr, the homogeneous steady states of this model are unstable to spatially inhomogeneous perturbations, leading to the formation of spatial patterns [15, 16]. Two types of perturbation to an array of cells at the homogeneous steady state have been studied. When a small perturbation is applied to cells along a line in a two-dimensional array, a pattern propagates out from this line. The resulting pattern consists of either continuous or broken lines of cells with elevated levels of receptor activation, running parallel to the line of initial perturbation (see Fig. 9a, b). When small random perturbations are applied throughout the array, roughly periodic patterns, with a characteristic wavelength, result (see Fig. 9c). These patterns are formed typically from localised small collections of cells that have a higher level of receptor activation than the intervening cells. Sensitivity analysis shows that both types of pattern are quite robust to changes in initial conditions and parameter values [15].
This mechanism of pattern formation, which has been termed lateral induction by Owen et al., is a new and potentially important mechanism for spontaneous pattern formation. Patterning depends on the establishment of a positive feedback that enhances receptor activity, and on the spatial localisation of this enhancement. This is achieved in the model by having strong feedback on free receptor (Pr steep) together with weak feedback on ligand (Pa shallow). A cell that develops a higher level of free receptor than its neighbours (due to random fluctuation) will consequently develop elevated levels of bound receptor by binding of ligand from neighbouring cells. In turn, this will increase the level of free receptor further, driving a positive feedback. Neighbouring cells cannot mount the same response, because a high proportion of the ligand on these cells is bound to the excess of receptors on the initially perturbed cell. It is this fact that, in the case of weak ligand feedback, ensures that cells developing high levels of receptor activation remain spatially isolated.


Fig. 9. Spatial patterns of ligand-receptor complex (measured in molecules per cell) in sheets of square cells resulting from the EBM equations with Pa and Pr monotonic increasing. In a and b, all cells are initially at a homogeneous equilibrium, apart from a small perturbation along the mid-line. Spots a and stripes b develop from the same initial conditions for different values of the parameters in the feedback functions Pa and Pr. In c, initial values of receptor and ligand are distributed randomly around a homogeneous equilibrium. The characteristic wavelength of the resulting pattern depends on the form of the feedback functions. a and b are solved on a 30 by 60 array; c is solved on a 30 by 30 array. Reproduced from [15], wherein further details can be found

Interestingly, this type of model might be applicable to signalling through the Notch receptor, in addition to the model discussed above. Depending on context, Notch activation can sometimes result in up-regulation of both free Notch and Delta, rather than the down-regulation of Delta [7, 22-28].

5 Further Developments

The models discussed here can be extended in a number of directions to investigate issues of biological importance. The exploration of some avenues has begun, while others remain unexplored. Some obvious possibilities are:
1. The inclusion of time delays into the equations. Specifically, if the regulation of ligand and/or receptor activity is mediated through gene transcription, then potentially significant time delays need to be included in the feedback functions Pa and Pr (Monk, in preparation).
2. The non-uniform distribution of ligand and receptor activities within the membranes of the signalling cells (Monk, in preparation). Many cells within epithelia are known to polarise within the plane of the epithelium [29]. One aspect of this polarisation that is becoming apparent is the polarisation of signalling activities [30, 31].
3. Non-monotonic forms of the feedback functions Pa and Pr.
4. The inclusion in the models of cell division and cell rearrangement (the rate of which could depend on the level of receptor activation). While these processes would be expected to occur on slower time-scales than the basic juxtacrine signalling process, they are known to have a significant impact on the final pattern of differentiated cells (see, for example, [32]).
5. The study of more extensive juxtacrine signalling networks, involving the interaction between a variety of juxtacrine and non-juxtacrine ligands. The work of Salazar-Ciudad et al. [33] suggests that large randomly connected juxtacrine signalling networks exhibit a rich array of spatio-temporal behaviours that are more robust than those exhibited by diffusively coupled networks. In reality, juxtacrine signalling interacts with diffusive signalling, and the resulting characteristic behaviour remains to be explored (but see, for example, [34]).

References

1. Fagotto, F. and Gumbiner, B. (1996). Cell contact-dependent signaling. Dev. Biol. 180, 445-454.
2. Krämer, H., Cagan, R. L. and Zipursky, S. L. (1991). Interaction of bride of sevenless membrane-bound ligand and the sevenless tyrosine-kinase receptor. Nature 352, 207-212.
3. Monk, N. A. M., Sherratt, J. A. and Owen, M. R. (2000). Spatiotemporal patterning in models of juxtacrine intercellular signalling with feedback, in Mathematical Models for Biological Pattern Formation. (Ed. P. K. Maini and H. G. Othmer), pp. 165-192, Springer, Berlin Heidelberg New York.
4. Weinmaster, G. (1998). Notch signaling: direct or what? Curr. Opin. Genetics Dev. 8, 436-442.
5. Heitzler, P. and Simpson, P. (1991). The choice of cell fate in the epidermis of Drosophila. Cell 64, 1083-1092.
6. Heitzler, P. and Simpson, P. (1993). Altered epidermal growth factor-like sequences provide evidence for a role of Notch as a receptor in cell fate decisions. Development 117, 1113-1123.
7. Heitzler, P., Bourois, M., Ruel, L., Carteret, C. and Simpson, P. (1996). Genes of the Enhancer of split and achaete-scute complexes are required for a regulatory loop between Notch and Delta during lateral signalling in Drosophila. Development 122, 161-171.
8. Simpson, P. (1997). Notch signalling in development: on equivalence groups and asymmetrical developmental potential. Curr. Opin. Genetics Dev. 7, 537-542.
9. Kumar, V., Bustin, S. A. and McKay, I. A. (1995). Transforming growth factor alpha. Cell Biol. Intl. 19, 373-388.
10. Massague, J. (1990). Transforming growth factor-α: a model for membrane-anchored growth factors. J. Biol. Chem. 265, 21393-21396.
11. Clark, A. J. L., Ishii, S., Richert, N., Merlino, G. T. and Pastan, I. (1985). Epidermal growth factor regulates the expression of its own receptor. Proc. Natl. Acad. Sci. USA 82, 8374-8378.
12. van de Vijver, M. J., Kumar, R. and Mendelsohn, J. (1991). Ligand-induced activation of A431 cell epidermal growth factor receptors occurs primarily by an autocrine pathway that acts upon receptors on the surface rather than internally. J. Biol. Chem. 266, 7503-7508.
13. Owen, M. R. and Sherratt, J. A. (1998). Mathematical modelling of juxtacrine cell signalling. Math. Biosci. 153, 125-150.
14. Owen, M. R., Sherratt, J. A. and Myers, S. R. (1999). How far can a juxtacrine signal travel? Proc. R. Soc. Lond. B 266, 579-585.
15. Owen, M. R., Sherratt, J. A. and Wearing, H. J. (2000). Lateral induction by juxtacrine signalling is a new mechanism for pattern formation. Dev. Biol. 217, 54-61.
16. Wearing, H. J., Owen, M. R. and Sherratt, J. A. (2000). Mathematical modelling of juxtacrine patterning. Bull. Math. Biol. 62, 293-320.
17. Collier, J. R., Monk, N. A. M., Maini, P. K. and Lewis, J. H. (1996). Pattern formation by lateral inhibition with feedback: a mathematical model of Delta-Notch intercellular signalling. J. theor. Biol. 183, 429-446.
18. Monk, N. A. M. (1998). Restricted-range gradients and travelling fronts in a model of juxtacrine cell relay. Bull. Math. Biol. 60, 901-918.
19. Luthi, P. O., Chopard, B., Preiss, A. and Ramsden, J. J. (1998). A cellular automaton model for neurogenesis in Drosophila. Physica D 118, 151-160.
20. Ermentrout, G. B. and Edelstein-Keshet, L. (1993). Cellular automata approaches to biological modeling. J. theor. Biol. 160, 97-133.
21. Othmer, H. G. and Scriven, L. E. (1971). Instability and dynamic pattern in cellular networks. J. theor. Biol. 32, 507-537.
22. Huppert, S. S., Jacobson, T. L. and Muskavitch, M. A. T. (1997). Feedback regulation is central to Delta-Notch signalling required for Drosophila wing vein morphogenesis. Development 124, 3283-3291.
23. de Celis, J. F. and Bray, S. (1997). Feedback mechanisms affecting Notch activation at the dorsoventral boundary in the Drosophila wing. Development 124, 3241-3251.
24. Micchelli, C. A., Rulifson, E. J. and Blair, S. S. (1997). The function and regulation of cut expression on the wing margin of Drosophila: Notch, Wingless and a dominant negative role for Delta and Serrate. Development 124, 1485-1495.
25. Panin, V. M., Papayannopoulos, V., Wilson, R. and Irvine, K. D. (1997). Fringe modulates Notch-ligand interactions. Nature 387, 908-912.
26. Christensen, S., Kodoyianni, V., Bosenberg, M., Friedman, L. and Kimble, J. (1996). lag-1, a gene required for lin-12 and glp-1 signaling in Caenorhabditis elegans, is homologous to human CBF1 and Drosophila Su(H). Development 122, 1373-1383.
27. de Celis, J. F., Bray, S. and Garcia-Bellido, A. (1997). Notch signalling regulates veinlet expression and establishes boundaries between veins and interveins in the Drosophila wing. Development 124, 1919-1928.
28. Wilkinson, H. A., Fitzgerald, K. and Greenwald, I. (1994). Reciprocal changes in expression of the receptor lin-12 and its ligand lag-2 prior to commitment in a C. elegans cell fate decision. Cell 79, 1187-1198.
29. Eaton, S. (1997). Planar polarization of Drosophila and vertebrate epithelia. Curr. Opin. Cell Biol. 9, 860-866.
30. Usui, T., Shima, Y., Shimada, Y., Hirano, S., Burgess, R. W., Schwarz, T. L., Takeichi, M. and Uemura, T. (1999). Flamingo, a seven-pass transmembrane cadherin, regulates planar cell polarity under the control of frizzled. Cell 98, 585-595.
31. Strutt, D. I. (2001). Asymmetric localization of frizzled and the establishment of cell polarity in the Drosophila wing. Mol. Cell 7, 367-375.
32. Goodyear, R. and Richardson, G. (1997). Pattern formation in the basilar papilla: evidence for cell rearrangement. J. Neurosci. 17, 6289-6301.
33. Salazar-Ciudad, I., Garcia-Fernandez, J. and Sole, R. (2000). Gene networks capable of pattern formation: from induction to reaction-diffusion. J. Theor. Biol. 205, 587-603.
34. Page, K. M., Maini, P. K., Monk, N. A. M. and Stern, C. D. (2001). A model of primitive streak initiation in the chick embryo. J. Theor. Biol. 208, 419-438.
Modelling the GH Release System

D. J. MacGregor, G. Leng, D. Brown

School of Biomedical and Clinical Laboratory Sciences, University of Edinburgh,


Hugh Robson Building, George Square, Edinburgh EH8 9LD, UK
duncan.macgregor@ed.ac.uk

1 Introduction

This chapter describes a model of the hypothalamic and pituitary components in-
volved in controlling growth hormone release. The model has been developed by
gathering and attempting to formalise the experimental data on the system but has
been kept as simple as possible, focusing on the functional rather than mechanical
properties of its components. In this way it has shown that a relatively simple
model can be capable of producing complex behaviour and accurately reproducing
the behaviour and output of a real brain system.

2 Research Background

Much of the information communicated between cells and systems in the brain is
represented not so much by the amplitude of activity but by the pattern of activity
over some time period. This has perhaps always been a more obviously useful and
robust alternative but it also requires that we attribute memory to very low-level
components of the brain.
One part of the brain that exhibits very definite temporally patterned activity is
the neuroendocrine system. This system also has the advantage of producing an
easily accessible and relatively simple output in the form of hormone release into
the bloodstream. We know that for several hormones, including growth hormone,
a pulsatile pattern of release is optimal for their actions within the body. In male
rats, as in humans and many other animals, Growth Hormone (GH) is released in
large pulses every 3h. A pattern of large pulses rather than continuous release al-
lows a maximal response to the hormone without desensitising the target receptors
and it is also believed that the pattern of mixed high and low activity may be used
to instruct different systems with some responding to high and some low activity.
These patterns are triggered by neurons but they operate on a timescale of hours,
far longer than the milliseconds over which action potentials fire. We need to un-
derstand how processes which operate at very different speeds can integrate with
each other and what the dynamics of such connections might be.
228 D. J. MacGregor, G. Leng, D. Brown

The neuroendocrine system forms the brain's interface to much of the body's
internal systems and its function is essential to many fundamental processes, par-
ticularly development and reproduction. Better understanding of these systems
will make us much more able to diagnose and treat patients with hormonal disor-
ders. Many children suffer from deficient GH release and in rarer cases over-
release of GH. We already have artificial peptides that can stimulate and control
GH release but greater understanding will allow us to administer them more effec-
tively and also to more finely diagnose the problems in individual cases.

3 GH Research

This model incorporates present biological understanding of the control of GH re-


lease in the rat, particularly the male, which shows a highly pulsatile pattern of re-
lease consisting of bursts of large pulses occurring every 3h or so separated by pe-
riods of very low GH levels. The female rat shows a more continuous level of GH
release consisting of smaller, less regular pulses and a higher basal level of release
(Fig. 1). Close examination of data on female GH release does suggest periods of
larger and smaller pulses, much less defined than in the male, but following a
similar temporal pattern of alternating periods of high and low activity.

[Figure 1 graphic: plasma growth hormone (ng/ml) plotted against time (h), in parallel panels for male and female rats]
Fig. 1: GH release in conscious male and female rats. Redrawn from [28].

GH is released from the anterior pituitary under the control of two hypotha-
lamic peptides, GH releasing hormone (GHRH) and somatostatin. GHRH stimu-
lates GH release and also synthesis. Somatostatin inhibits release by blocking the
pathway by which GHRH acts. In this way, GHRH is responsible for individual
bursts of GH release but somatostatin seems to be responsible for the overall pat-
tern of release by exerting permissive control over the somatotroph release re-
sponse, and probably also through hypothalamic mechanisms.
These peptides are synthesised by two groups of hypothalamic cells, the GHRH
neurons in the arcuate nucleus and the somatostatin neurons in the periventricular
nucleus. These neurons have axons that project to the median eminence where
they release the peptides into the portal blood supply to be carried to the pituitary.
It was these projections that helped to identify the two groups of neurons. Unfor-
tunately the portal supply's very small volume makes its content very difficult to
measure with any decent temporal resolution. In addition to this, GHRH neuron
activity is knocked out by anaesthesia making it impossible to record endogenous
behaviour. Recording from somatostatin cells is particularly difficult because they
are spread in a very thin layer. These problems mean that there is no direct way to
discover the real patterns of GHRH and somatostatin release in the working sys-
tem and so more indirect methods have to be employed.

3.1 Experimental Approach

The experiments that have been used to investigate the system can be divided into
three groups, behavioural, electrophysiological and anatomical. Behavioural ex-
periments look at how the system as a whole works, and so the final output, GH
release into the bloodstream, is measured. These are in vivo experiments, usually in
conscious animals, which consist of adding or knocking out substances or path-
ways to test what effect they have on GH release.
In electrophysiological experiments we go directly to the neurons thought to be
involved in the system to measure their activity. Although the ability to do this is
limited in the GH system, we do still have the options of recording and stimulating
the GHRH neurons and stimulating the somatostatin neurons, since stimulation
requires less accurate electrode placement. We can still test the GHRH neurons'
response to artificial stimulation in order to determine their properties even though
we can't record normal activity.
Anatomical experiments investigate the system at a lower level, determining
the properties of individual components, such as groups of cells, and the connec-
tions between them. This will include a range of molecular and cellular techniques
used to find out what types of receptors each group of cells has, what substances
act on them and what substances they release. This is probably the most difficult
level at which to draw any definite conclusions and individual experimental re-
sults will really only give clues. A single part of the brain can hold groups of cells
performing many different tasks and it can be difficult to be sure that results relate
to the system under investigation. Many substances will appear to act on the cells
being examined, but only some of these will play a real role in the action of our
system. The best evidence comes when several experiments point to the same
conclusion from different directions, i.e. we find the receptors for a substance, the
substance has a functionally useful effect on the cells, and another group of cells
in our system releases this substance.

3.2 Anatomical Results

Two main groups of hypothalamic neurons controlling GHRH and somatostatin


release have been identified. The main group of GHRH neurons is in the arcuate
nucleus, forming a fairly large group of around 1,500 cells [1]. There are some
other GHRH-containing cells in the hypothalamus but these do not appear to pro-
ject to the median eminence [2]. The main somatostatin neurons are in the
periventricular nucleus (PeN), spread in a thin layer close to the third ventricle [3].
There is another smaller group of somatostatin neurons in the arcuate nucleus, but
these also do not project to the median eminence. However, their close location to
the GHRH neurons suggests that they may play some role in the system.
There is good evidence at this level that GH feeds back directly to the soma-
tostatin neurons. GH receptor mRNA has been detected in the PeN; i.c.v. GH re-
ceptor antisense reduces somatostatin mRNA; administering GH to an in vitro PeN
preparation increases somatostatin release and mRNA levels [4]; and an in vivo
study has shown an increase in c-fos [5], indicating increased neuronal activity in
the PeN following i.v. GH. These results suggest that GH itself acts at the PeN to
stimulate somatostatin release and synthesis. It is not known how GH could be
transported from the peripheral release site at the pituitary to the hypothalamus.
Another central connection for which there is evidence is from the somatostatin
neurons to the GHRH neurons. In general, many results show effects that cannot
be determined between a GHRH or somatostatin site of action, suggesting the pos-
sibility of a link. More direct low-level evidence includes the co-localisation of
somatostatin receptors with GHRH neurons [6], and also somatostatin causing a
reduction in GHRH mRNA at the GHRH neurons.
Many other substances have an effect on GH release, either by direct actions at
the pituitary or by acting on GHRH or somatostatin release in the hypothalamus.
However, it is likely that only a few of these actually play a role in normal func-
tion of the system and so we look for those that have more complete evidence
about their function such as site of release, site of action and strength of effect to
identify those that might be important to controlling GHRH and somatostatin re-
lease. Several substances have been co-localised with GHRH, including tyrosine
hydroxylase (TH, used in the synthesis of L-DOPA and hence dopamine) [7], GABA [8], ace-
tylcholine and galanin [9]. Dopamine has an inhibitory effect on GH release and
has been shown to stimulate somatostatin release [10]. However, it also apparently
increases GHRH release when administered in conjunction with somatostatin anti-
sera [11]. This may be evidence that dopamine is involved in some interaction
between the GHRH and somatostatin neurons. GABA also has an inhibitory effect
on GH, but appears to act on the GHRH neurons themselves, possibly mediating
an auto-inhibitory action. Acetylcholine may act to inhibit somatostatin, as has
been demonstrated in vitro [12]. Galanin has also been shown to inhibit soma-
tostatin release, with the consequent stimulatory effect on GH.

The central adrenergic system also appears to play a role in GH release, possi-
bly acting as direct input to GHRH neurons. The centrally acting adrenergic recep-
tor agonist clonidine increases GH release [13] and receptor antagonists [14] and
adrenaline synthesis inhibitors [15] block GH release. The action of clonidine is
blocked by GHRH antisera, but not somatostatin antisera, pointing at the GHRH
site of action.

3.3 Electrophysiological Results

We can't measure endogenous GHRH neuron activity but what has been done is
to stimulate the GHRH neurons with varying patterns of signal and measure the
GH release in response. By comparing this with normal GH release we can at least
narrow down the range of feasible patterns of activity that might exist in the natu-
ral system, although the detail of any conclusions is limited by the temporal reso-
lution at which we can measure GH release, on a scale of minutes compared to the
much finer timescale on which electrophysiological experiments can work.
Stimulating the GHRH neurons at varied pulse frequencies shows no difference
in GH release levels but increasing the number of pulses produces a large non-
linear increase in GH release [16]. Giving larger doses of GHRH directly to the pi-
tuitary does increase GH release but not in the same non-linear fashion suggesting
that the non-linear relation is at the level of the hypothalamus or may also involve
somatostatin.
The other type of experiment which has been conducted is stimulating the
periventricular nucleus and testing the response in the arcuate neurons [17]. Prob-
able GHRH neurons are identified among the arcuate cells by testing for a re-
sponse to stimulation at the axonal terminals in the median eminence. Most of the
GHRH neurons are inhibited during periventricular stimulation and also show a
rebound hyperactivation after the periventricular stimulation.

3.4 Behavioural Results

Behavioural experiments involve measuring GH release in conscious animals,


mostly male and female rats, and testing how release is affected by artificially ad-
ministered GHRH, somatostatin and GH itself. Some experiments have also used
antibodies to knock out endogenous GHRH or somatostatin in order to isolate the
actions of exogenous substances from normal hypothalamic release.
Experiments using GHRH antiserum and artificially administered GHRH prove
the excitatory role of GHRH in pituitary GH release. With a sufficient dose of
GHRH antibody GH release is completely abolished and smaller doses reduce the
amplitude of GH pulses [18]. The direct GH response to artificial GHRH has been
tested by using antibodies to knock out the endogenous GHRH and somatostatin.
Each GHRH injection produces a pulse of GH release in response, which in-
creases in amplitude with larger concentrations of GHRH up to a maximal level
[19]. The response to repeated injections varies and this is likely due to desensiti-
sation at the pituitary GHRH receptors or some remaining somatostatin. These
results tell us that GH pulse amplitude is related to the size of the GHRH pulse and also
that each GH pulse requires a corresponding pulse of GHRH release.
If somatostatin alone is knocked out, the low basal GH secretion between bursts
in males is increased [20], suggesting that somatostatin is responsible for the re-
duced GH release, rather than lower GHRH activity. This also means that there is
either a certain level of GH release from the pituitary independent of GHRH, but
modulated by somatostatin or that there is still GHRH activity between GH bursts
which is being controlled by somatostatin either at the hypothalamic level or at the
pituitary.
Giving GHRH injections to male rats without knocking out endogenous soma-
tostatin produces very variable GH pulses. Female rats, however, produce more
regular GH pulses in response to GHRH. These are smaller than the largest GH
pulses observed in males but are also larger than the smallest GH responses in
males. If a female rat is given three-hourly injections of GHRH over several days
it will produce a pattern of GH release very similar to the natural pattern observed
in males [21]. The animals become entrained into the artificially stimulated pattern
eventually showing no endogenous release between induced GH pulses. This
would require a change in the patterns of hypothalamic GHRH and somatostatin
release and so there must be some hypothalamic action either by GHRH directly
or through some other feedback mechanism.
If a male rat is given repeated injections of GHRH it will continue to produce
a large GH response only every 3 h., even with much more frequent GHRH injec-
tions [22]. This refractory period is not likely to be due to a depletion of GH at the
pituitary since it takes 24 h. of continuous GHRH before significantly depleting
pituitary GH content, and female rats continue to respond to even frequent GHRH
injections. It is more likely that the refractory period is due to a cyclic pattern of
somatostatin release.
When male rats are infused with a sufficient dose of somatostatin, GH release
is abolished [23]. At a lower dose the endogenous pattern of release remains, but
with smaller amplitude GH pulses. When the somatostatin infusion is stopped
there is a large rebound pulse of GH release, which increases in size with the
somatostatin dosage. A small rebound GH release in response to somatostatin
withdrawal can be observed with pituitary preparations in vitro, but this is of much
smaller magnitude than the in vivo rebound. The rebound effect is reduced by the
addition of GHRH antiserum and also by urethane anaesthesia [24], which knocks
out hypothalamic GHRH activity, suggesting an active hypothalamic component
rather than just the removal of the somatostatin inhibitory influence at the pitui-
tary.
If female rats are given a three-hourly pattern of somatostatin infusions, on for
150 min and off for 30 min, they produce a pattern of GH release very similar to
the male pattern. Using a more sinusoidal pattern of somatostatin does produce
three-hourly bursts of GH release but more extended and of lower amplitude than
those observed in the male. These results all support the idea of high levels of
somatostatin between GH bursts, the inhibitory effects of which start and stop
fairly abruptly, either because of the dynamics of somatostatin's effect within the
hypothalamus, or directly because of its release pattern.

Behavioural experiments have also demonstrated that GH can have a major ef-
fect on its own release. Giving an infusion of GH to male or female rats sup-
presses the endogenous pattern of release [25]. However, unlike GHRH and soma-
tostatin which act fairly immediately, the effect is over quite a long timescale,
taking in the range of 1 h. to develop, and 1 to 2 h. for normal release to recover
after the infusion.
What is uncertain is whether the suppression works by increasing somatostatin
release or by inhibiting GHRH release. In female rats, when GHRH injections are
given during a GH infusion there is still a large GH response to each injection, al-
though occasionally with some reduction in amplitude compared to injections be-
fore the GH infusion [26]. If somatostatin is infused then there is no response to
the GHRH injections. However, in male rats, which are always only intermittently
responsive to GHRH injections, only one GHRH injection produces a GH pulse
during a 6 h. infusion of GH. The response they do give, compared to responses
before and after the infusion, seems to follow the natural three-hourly pattern of
GHRH responsiveness, but with an extended refractory period. This suggests more
than just suppression of GHRH by the GH infusion, perhaps prolonged soma-
tostatin release. Smaller pulses of GHRH during GH infusion in females still pro-
duce large GH pulses, evidence against somatostatin stimulation. The response to
GH seems to vary across gender, which, given the general variation in GH release,
makes GH a good candidate for an endogenous feedback mechanism. There are
other substances such as IGF (insulin-like growth factor) which are also thought to
mediate GH feedback but these act too slowly to directly control the normal re-
lease pattern. If GH is given in more natural injections, rather than infusion, then
after one or two three-hourly injections, the rats become entrained in a similar
manner to repeated GHRH injections, producing pulses of GH in synchronisation
with the injections [27]. More frequent 1.5-hourly injections cause the period be-
tween GH pulses to increase, instead of synchronising to the more frequent pat-
tern. This suggests that it may be GH itself that triggers the 3 h. refractory period.

4 Creating the System

The first stage in building a system model is to define the desired output. The
model GH system needs to be able to reproduce the pattern of GH release in the
rat. The male pattern is the best one to use initially, since this is a better defined
pattern, but we would also hope that it can easily be adapted to reproduce the fe-
male pattern. The GH system in the female is likely to be similar with only a few
variations which produce the different behaviour. GH release in the male rat oc-
curs in three-hourly bursts of several large pulses of GH over roughly 1 h. (the pe-
riod of apparent pulse activity varies in different results). Between the bursts there
is very low basal release of GH. In modelling a whole system at this level it is
more important to be able to reproduce the characteristics of the real data rather
than exact quantitative details.
The next stage is to lay out the system's components and what we know of how
they behave and interact with each other.

What we know about the system:


• GHRH stimulates GH release at the pituitary.
• Somatostatin inhibits GH release at the pituitary.
• Somatostatin has an inhibitory connection to the GHRH neurons.
• GH exerts negative feedback on its own release.
• GH has a stimulatory effect on the somatostatin neurons.
These pieces give the basic model in Fig. 2.

[Figure 2 graphic: schematic of the hypothalamus, portal blood supply and pituitary with their excitatory and inhibitory connections]
Fig. 2: A basic model of the GH release system

This basic model is used to try and think of a working system. The pattern of
output repeats in a cycle and so in the absence of any outside generator producing
this pattern the system needs to have its own cycle. GHRH triggers GH release,
which feeds back to increase somatostatin release, which would inhibit GHRH and
GH release. Eventually the somatostatin stimulated by GH would fall away and
GHRH would once again trigger GH and repeat the cycle. However, what this is
missing is an input for the GHRH pulses, and so we go back to the experimental
data. Here we find two possibilities, an adrenergic input and the rebound GHRH
spike triggered after inhibition by somatostatin. In modelling it is considered best


to first of all go with the simplest solution, in this case the adrenergic input. This
follows the precedent set by the use of a noradrenergic input in a model of the LH
release system and also general knowledge that adrenergic inputs appear to act as
triggering inputs in many other systems. The data suggest that this input should be
a series of short pulses, but of course this can be experimented with. Adding this
input, on paper at least, gives a system sufficient to produce pulsatile GH output
and so the model goes on to be implemented, first as a set of variables and equa-
tions and then on computer.

4.1 Simplifications

This is a model of the system at a functional rather than mechanical level, and ma-
jor simplifications have been made in representing the system's components.
GHRH and somatostatin are both released and controlled by groups of thousands
of neurons but each is represented by a single variable which measures the level of
the peptide in the system. We are assuming that each group of neurons can be
treated as a single unit. This works in this system for several reasons, the main one
being the timescale of the model. The model works on a scale of minutes, whereas
neurons work on milliseconds and so activity is averaged out. We know that the
neurons do synchronise because this is necessary in any system to produce pulsa-
tile output. Without synchronisation, changes in overall activity will be much less
sudden. The other major reason is the way in which electrical activity is trans-
duced into hormone release. Although individual action potentials trigger individ-
ual releases of peptide, the release from individual neurons all diffuses into a
common transport channel to the pituitary, and so the GH releasing cells experi-
ence overall GHRH activity rather than the actions of single cells.

5 Making the Experimental Model

The state of the system is stored as a set of variables that represent measures in the
system such as hormone blood level, electrical activity or number of free recep-
tors. The functions of each component may be further broken down into more
variables but this will only be done with the aim of reproducing the component's
behaviour rather than just its mechanism. The basic GH model based on the dia-
gram above has five variables, representing the blood concentrations of GHRH,
GH and somatostatin, the level of releasable somatostatin and the number of free
GHRH receptors at the pituitary. This last variable comes from previous work,
which developed a model of the pituitary component of the system, relating GH
release to the levels of GHRH and somatostatin.
Each variable has a corresponding differential equation which models its be-
haviour and these equations will contain parameters which relate to some measure
such as a threshold or synaptic strength, or may belong to a more obscure mathe-
matical construction. The model also contains other equations, which calculate
values such as level of receptor activation for use in the differential equations.
Hill equations are used to model the actions of the peptides on the groups of
neurons. In general, these equations model the effects of ligand-receptor binding
and allow a variable threshold for activation and variable steepness using the Hill
coefficient which measures the degree of cooperativity of the binding. If binding
is cooperative, then the binding rate is affected by the current level of bound com-
plex, so if the effect is positive, the binding rate will increase as the level of bound
complex increases. If the effect is negative then the binding rate is reduced. Essen-
tially, this controls the steepness with which the level of binding increases. These
equations give a biologically realistic, but still simple, way of getting a measure of
activation where there is a substance binding to receptors. The value ranges from 0
to 1, with a higher value indicating a higher level of binding, or activation.
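A Hill activation of this kind is easy to sketch; the following is a generic illustration, not the chapter's own code:

```python
def hill(x, threshold, n):
    """Hill activation: fraction of receptors bound at ligand level x.

    threshold is the level giving half-maximal binding, and the Hill
    coefficient n sets the steepness (n > 1 for cooperative binding).
    The result always lies between 0 and 1.
    """
    return x**n / (threshold**n + x**n)
```

With x equal to the threshold the function returns 0.5, and raising n makes the transition from low to high activation sharper, which is how the steepness of each connection can be tuned.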

5.1 Storage Variables

For this model a type of variable has been developed which represents a measure
of the ability to release a substance, or 'releasability'. This is deliberately vague
because it is intended to represent behaviour rather than a particular mechanism.
The real biological substrate could be something like vesicle mobilisation or the
number of activatable receptors. The idea is that they allow a substance to charge
up the ability to release something without directly triggering its release. They
form a kind of memory that allows a substance to trigger an effect which takes
place over a longer timescale or at a different time point to when the original trig-
ger substance was experienced. This was originally developed in order to model
rebound effects. The storage variable could be charged up during inhibition while
its release was blocked and then allowed to drain quickly after inhibition to pro-
duce a large rebound spike. It has also been used to model the effect of GH on
somatostatin, by getting GH to charge up a somatostatin release variable so that
the relatively short period of GH pulses can trigger a much longer period of soma-
tostatin release.
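The charge-and-drain behaviour can be illustrated with a minimal sketch; the function name and parameter values here are hypothetical, not taken from the model:

```python
def step_store(store, inhibited, dt=0.1, k_charge=1.0, k_release=5.0):
    """One Euler step of a 'releasability' store.

    While inhibited, the store charges at rate k_charge and releases
    nothing; once inhibition is lifted it drains at rate k_release,
    producing a brief rebound spike of release.
    Returns (new_store, release_rate).
    """
    release = 0.0 if inhibited else k_release * store
    charge = k_charge if inhibited else 0.0
    return store + dt * (charge - release), release
```

Running this for a period of inhibition followed by free release gives zero output during inhibition and a spike that decays immediately afterwards, which is the rebound pattern described above.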

5.2 Input Protocols

Even fairly complete models of self-contained systems usually need some sort of
external input to control them. When the system of equations is run as a computer
program the variables are progressively calculated at discrete time points. Inputs
are usually given to the system by perturbing the variables, changing them to a
specified value at a specific time point. A whole series of inputs can be defined in
order to form a pattern, such as a series of spikes and this whole pattern of inputs
is known as the protocol. Each variable can have its own protocol but usually only
a few of them will be controlled in this way.
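A protocol of this kind can be sketched as a list of timed perturbations applied while stepping the model; the names and data layout here are hypothetical:

```python
def apply_protocol(state, protocol, t):
    """Apply any perturbations scheduled for time point t.

    state    : dict mapping variable names to their current values
    protocol : list of (time, variable_name, value) tuples; a series
               of such entries forms a pattern such as a spike train
    """
    for when, name, value in protocol:
        if when == t:
            state[name] = value
    return state
```

At each discrete time point the simulation would call this before evaluating the equations, so that, for example, an input variable is driven to a fixed value at each scheduled spike time.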
Modelling the GH Release System 237

6 The Model

The GH system model builds on previous work, which developed a model of GH


release at the pituitary using GHRH and somatostatin levels as inputs. This uses
three variables for GHRH, somatostatin and GH blood levels and a fourth which
gives a measure of free GHRH receptors at the GH releasing cells.

The variables are:


r - GHRH blood level
s - somatostatin blood level
h - GH blood level
f - free GHRH receptors
An earlier version of the pituitary model used an extra variable for the number
of activated receptors but this was simplified by scaling the total number of recep-
tors to 1, so that the activated receptors would be 1 - f. The pituitary model was
successfully fitted to real GH release data, mostly from in vitro experiments. This
gives a reliable base on which to build the hypothalamic components.

6.1 The Pituitary Model

The concentrations of GHRH (r) and somatostatin (s) are modelled by equations,
which represent release rate and the rate of clearance in the bloodstream:

dr/dt = I_r - k6 r    (1)

ds/dt = I_s - k7 s    (2)

I_r and I_s are the release rates. The values k6 and k7 give the clearance rates, mod-
elled as proportional to the current concentration.
The effect of somatostatin on the processes of receptor recycling and GH re-
lease is modelled by calculating a level of somatostatin activation Φ(s), which is a
non-linear function of somatostatin concentration:

Φ(s) = 1 / (1 + e^-(log10 s - s0)/φ0)    (3)

This equation produces a form of sigmoid. The level of activation varies be-
tween 0 and 1. After s reaches a certain value, controlled by s0, the activation
value will increase quickly and then level off at a plateau. In many biological sys-
tems the effect of a substance varies as its log rather than linearly, i.e. at low con-
centrations a small increase has a large effect, but the same increase at a larger
concentration has less effect, and so the base 10 log of s is used instead of just s it-
self.
The number of free GHRH receptors, f, is modelled by:

df/dt = -k1 (r + c) f + (k2 + k3 Φ(s))(1 - f)    (4)

The first part of the equation represents the rate of receptor binding, modelled
as the product of the number of free receptors and the GHRH concentration, at rate k1.
This reduces the number of free receptors and so is negative. The extra value, c, is
added to the GHRH concentration in order to represent constitutive activation of
the GH release pathway, which produces basal release. The second part represents
receptor recycling, which is the product of the number of bound receptors, (1 - f),
and the sum of the normal rate, k2, and the rate due to somatostatin, k3 Φ(s).
The final equation gives the system's output, the rate of GH release:

dh/dt = [k4 + k5 (1 - Φ(s))](r + c) f    (5)

This models GH release as the product of the rate of receptor binding, (r + c)f,
and a sum that allows release at rate k4 independently of somatostatin and extra re-
lease at rate k5, which is modified by somatostatin activation. Normally k5 would be
much larger than k4, making the somatostatin-blockable component much more
significant.
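Equations (1)-(5) can be stepped forward with a simple Euler scheme, as in the sketch below. The parameter values are placeholders rather than the fitted values from the original pituitary work, and the slope parameter of Φ(s) is an assumption:

```python
import math

def simulate_pituitary(I_r, I_s, steps=1000, dt=0.01,
                       k1=1.0, k2=0.1, k3=0.5, k4=0.01, k5=1.0,
                       k6=0.5, k7=0.5, c=0.01, s0=0.0, phi0=0.3):
    """Euler integration of the pituitary model, eqs (1)-(5).

    I_r and I_s are functions of time giving the GHRH and somatostatin
    release rates. Returns a list of (t, r, s, f, h) tuples; h is the
    accumulated GH release, since eq (5) gives the release rate.
    """
    r, s, f, h = 0.0, 0.0, 1.0, 0.0
    trajectory = []
    for i in range(steps):
        t = i * dt
        # eq (3): sigmoidal somatostatin activation, between 0 and 1
        # (a small offset keeps log10 defined when s is zero)
        phi = 1.0 / (1.0 + math.exp(-(math.log10(s + 1e-12) - s0) / phi0))
        dr = I_r(t) - k6 * r                                  # eq (1)
        ds = I_s(t) - k7 * s                                  # eq (2)
        df = -k1 * (r + c) * f + (k2 + k3 * phi) * (1.0 - f)  # eq (4)
        dh = (k4 + k5 * (1.0 - phi)) * (r + c) * f            # eq (5)
        r, s, f, h = r + dt * dr, s + dt * ds, f + dt * df, h + dt * dh
        trajectory.append((t, r, s, f, h))
    return trajectory
```

Driving this with a brief GHRH pulse and no somatostatin yields a clear accumulation of released GH, while a high constant somatostatin input largely blocks release, matching the qualitative behaviour described in the text.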

6.2 The GH System Model

The full model adds to the pituitary model the hypothalamic connections and
components which control the GHRH and somatostatin neurons. One new storage
variable, s_r, is used to give a measure of the ability of the periventricular soma-
tostatin neurons to release somatostatin. The model adds two new connections, an
inhibitory link from somatostatin to the GHRH neurons and a feedback link from
GH to somatostatin that increases the level of releasable somatostatin. It also adds
a delay to the GH level, giving a delayed level h_del, representing the time it takes
from pituitary release to have an effect on the somatostatin neurons. Without this
delay GH will effectively auto-inhibit by causing the somatostatin level to
immediately rise, making the GH pulse much smaller and shorter. The new links
are modelled by two Hill equations, r_act and h_act, which give the level of
activation at the GHRH and somatostatin neurons respectively:

r_act = 1 - s^n1 / (th1^n1 + s^n1)    (6)

h_act = h_del^n2 / (th2^n2 + h_del^n2)    (7)

The parameters th1 and th2 give the thresholds for activation (really the level
for half-maximal receptor binding) and n1 and n2 are the Hill coefficients giving
the steepness with which the level of activation increases. The r_act equation is sub-
tracted from 1 because it is being used for inhibition. The new equation for GHRH
release adds the connection from somatostatin by allowing r_act to modify the input
activity I_r:

dr/dt = I_r r_act - k6 r    (8)

The level of somatostatin activation by GH, h_act, is used in the new s_r equation:

ds_r/dt = k8 h_act - k9 I_s s_r    (9)

Instead of directly affecting somatostatin release, GH charges up s_r so that there
is a large pool of releasable somatostatin that can be gradually released to give the
prolonged period of low GH release activity. Charge increases at a rate propor-
tional to GH receptor activation (h_act), controlled by the parameter k8. The level
of charge decreases with somatostatin release, at a rate controlled by k9. The new
equation for somatostatin release combines the input activity, I_s, with s_r in a
similar fashion to the GHRH equation:

ds/dt = I_s s_r - k7 s    (10)

Combined with the pituitary model these equations implement the basic GH
system model described in Fig. 3.

[Figure 3 graphic: the GH system model, with a random adrenergic input to the GHRH neurons and a constant electrical input to the somatostatin neurons]

Fig. 3. The GH system model. The model uses a previous model of the pituitary GH release
response to GHRH and somatostatin, from earlier work by Elinor Stephens. It includes, in
addition to the basic components, a random adrenergic input to the GHRH neurons, con-
stant electrical input to the somatostatin neurons, a variable representing the releasable
store of somatostatin, s_r, an inhibitory connection from the somatostatin to GHRH neurons
and a delay value for GH feedback. This figure also shows the experimental inhibitory link
from GHRH to somatostatin.

7 Working with the Model

The set of equations has been implemented as a computer program that runs the
equations over a fixed number of time steps, recording the value of each variable
at each step. Each variable can then be displayed as a graph which shows how it
changes with time, and the graphs can be viewed together to examine how the be-
haviours of the variables relate to each other. The model's parameters and the input
protocols can be altered and the model immediately run again to show what effect
the change has. The equations themselves can be altered by editing the program
code and then restarting the model program.
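The driver loop described above can be sketched as a fixed-step (forward-Euler) integrator that records every variable at every step. The toy one-variable decay equation below merely stands in for the full GH equation set; the structure of the loop is the point.

```python
# Minimal sketch of the simulation driver: step the model equations
# forward with a fixed time step, recording every variable at every
# step. A toy decay equation stands in for the full GH system.

def run_model(derivs, state, dt=0.1, n_steps=100):
    """Forward-Euler integration; returns the full history of each variable."""
    history = {name: [value] for name, value in state.items()}
    for _ in range(n_steps):
        rates = derivs(state)
        state = {name: state[name] + dt * rates[name] for name in state}
        for name, value in state.items():
            history[name].append(value)
    return history

# Stand-in equation: dr/dt = 1 - r, which relaxes towards 1.
history = run_model(lambda s: {"r": 1.0 - s["r"]}, {"r": 0.0})
```

The recorded history is exactly what the program's graphing window would display, one trace per variable.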

7.1 The Model Parameters

The parameters from the pituitary model were fixed by previous work which fitted
the pituitary model to real in vitro experimental data. The pituitary model is able to
match the dynamics of its individual components in real data and also the behav-
iour of the pituitary GH release system as a whole, specifically desensitisation to
GHRH, and the smaller pituitary version of rebound GH release following soma-
tostatin withdrawal. The pituitary model and its parameter values are assumed to
be a sound basis on which to build and test the hypothalamic components of the
model.
The experimental data on the hypothalamic systems only indicate the existence
and overall effect of individual substances or connections, and so there are no data
from which to directly derive values for the hypothalamic parameters. The values
have had to be determined by altering them so that the model produces the desired
pattern of GH output with plausible patterns of GHRH and somatostatin. An
attempt has been made to retain biological plausibility with all the parameter values.
The Hill equations can be made to behave like switches if the Hill coefficients are
very large, but the dynamics of individual receptor-triggered processes are seldom
this sharp and so the coefficients have been kept at small values. The thresholds of
these equations have been set so that an appropriate level of the triggering sub-
stance (somatostatin or GH, in the current model) is able to get the desired re-
sponse, i.e. blocking GHRH release or sufficiently charging releasable soma-
tostatin. It is also desirable to put parameter values in a range where small changes
do not have a major effect on behaviour. Biological systems need to be robust and
the model should be the same. This characteristic is often a good indicator that a
model is a good representation of the real biological system.
Many of the parameters will depend on each other, hence there are likely to be
many sets of parameters that will produce the same behaviour. This goes some
way towards limiting the range of behaviour, but it will still be vast, increasing
exponentially with the number of parameters. Testing the model's range of behaviour,
beyond the target pattern of GH output, through varying the parameter values is an
essential part of the experimentation in order to understand the significance of the
model's components and their parameter values. What is desirable is some auto-
mated system to search the whole space of all possible parameter values and dis-
cover the model's full range of behaviour. To do this we need to define precisely
how to assess the model's performance so that it can be implemented as part of the
model program. The important characteristics of the GH output pattern need to be
defined.

7.2 Assessing Performance

The goal of the model is to be able to reproduce the pattern of GH release in the
rat. Modelling is mainly concerned with reproducing the qualitative characteristics
of the data but it should at least show the same relative differences in levels of
hormone. A set of characteristics must be defined that is sufficient to differentiate
the patterns that meet the requirements from those that do not. Since this is
numeric data, these will at least need to start out as measurements, so we list what
measurements might determine a male-rat-type pattern:
• Pulse amplitude
• Pulse duration
• Burst duration
• Number of pulses per burst
• Inter-burst GH level
• Inter-burst duration
The first stage is probably deciding whether regular bursts exist at all. If this is
established then the individual bursts can be identified and then the measurements
can be made. A male rat type pattern requires approximately three-hourly bursts
with several high amplitude pulses and very low levels of release between bursts.
The female pattern may not have regular bursts but should at least have fairly fre-
quent pulses of at least a third of the amplitude seen in the males. These judge-
ments are easy to make with the eye but are difficult to formalise sufficiently for
the sake of conventional computer implementation.
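One way such a formalisation might begin is with a simple threshold rule, sketched below. The threshold value and the unit sampling interval are arbitrary choices made for the sketch, not part of the chapter's model.

```python
# Sketch of automated burst detection and measurement for a GH trace.
# Threshold and sampling interval (dt) are arbitrary demonstration values.

def find_bursts(gh, threshold):
    """Return (start, end) index pairs of runs where GH exceeds threshold."""
    bursts, start = [], None
    for i, level in enumerate(gh):
        if level > threshold and start is None:
            start = i
        elif level <= threshold and start is not None:
            bursts.append((start, i))
            start = None
    if start is not None:
        bursts.append((start, len(gh)))
    return bursts

def measure(gh, bursts, dt=1.0):
    """Pulse amplitude, burst duration and inter-burst interval per burst."""
    return [{"amplitude": max(gh[a:b]),
             "duration": (b - a) * dt,
             "interval": (bursts[k + 1][0] - a) * dt if k + 1 < len(bursts) else None}
            for k, (a, b) in enumerate(bursts)]

gh = [0, 0, 5, 8, 5, 0, 0, 0, 6, 9, 0]
bursts = find_bursts(gh, threshold=1)
stats = measure(gh, bursts)
```

Counting pulses within a burst, and deciding whether the bursts are regular enough to count as the male pattern, would be built on top of these primitive measurements.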

7.3 Initial Results

The basic version of the GH system model uses the equations described above
with regular patterns of input to the GHRH and somatostatin neurons to success-
fully produce a pattern of GH release similar to the male rat. The somatostatin
neurons are given constant fixed stimulation so they release somatostatin when-
ever s_c is charged, i.e. somatostatin is ready for release after being charged by GH.
The GHRH neurons get a regular pattern of short pulses, 1 or 2 min long and usu-
ally spaced 15 min apart. These pulses reproduce the multiple peaks observed in
GH release but they do not directly control the timing of the GH bursts. Fig. 4 il-
lustrates the model output.
The model begins running with somatostatin already charged to a low level and
so the first GHRH pulses are inhibited, but when the somatostatin level falls as s_c
discharges the somatostatin-GHRH inhibition is removed and pulses of GHRH
are released triggering several pulses of GH release from the pituitary. After the
30 min delay the GH pulses cause s_c to recharge and the somatostatin level in-
creases again, ending the GH burst.

7.4 Comparison with Real GH Release

In Fig. 5 the model produces a similar pattern of GH release to the male rat. It
shows the same multiple-pulse bursts of GH release following a similar timescale.
In Fig. 5 there is approximately 2.5 h between bursts, but by increasing the delay
to 40 min the model can be made to match the three-hourly pattern. How closely
we can analyse the shape is limited by the resolution of real data. Most results
measure the GH level at intervals of at least 5 min, and so although we do know
that there are multiple peaks, we do not know how sharp these really are or how
suddenly the GH burst starts and finishes. The one difference with the model that
is apparent is that in real data the first pulse in the burst is often the largest,
whereas in the model the somatostatin remaining when the first GHRH pulses fire
causes the first pulses to be smaller. This is less apparent when the model uses
random intervals for the GHRH input pulses, and the random pulses make a better
match to real data. The effect is mainly controlled by the Hill equation which
models the somatostatin-GHRH link, r_act. If the binding coefficient n1 is made
larger then the inhibition is much sharper, acting like a switch and eliminating the
partially inhibited GHRH pulses.
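The effect of the binding coefficient can be checked numerically. In the small sketch below, th1 = 1 is an arbitrary threshold chosen for the demonstration.

```python
# Illustration of the remark above: raising the Hill coefficient n1
# makes the somatostatin-GHRH inhibition switch-like. th1 = 1 is an
# arbitrary demonstration threshold.

def r_act(s, th1=1.0, n1=2):
    """GHRH activation under somatostatin inhibition (eq. 6 form)."""
    return 1.0 - s**n1 / (s**n1 + th1**n1)

# Evaluate below, at, and above the threshold for a small and a large n1.
graded = [r_act(s, n1=2) for s in (0.5, 1.0, 2.0)]
switch = [r_act(s, n1=20) for s in (0.5, 1.0, 2.0)]
```

With n1 = 2 the activation grades smoothly through the threshold, so partially inhibited pulses of intermediate size appear; with n1 = 20 activation is nearly full below threshold and nearly zero above it, eliminating them.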

Fig. 4. The model program. This shows the software developed to run the model, running
on a PC in Windows. The main window displays graphs of the variables as the model's
output. The boxes to the right allow all of the model's parameters and input protocols to be
modified. The boxes to the left of the graphs allow the scale to be altered, allowing the
viewer to zoom in or out on the data. The info box displays the values pointed at by the
mouse

Fig. 5. Results using the current basic version of the model, with regular (left) and random
Poisson (right) pulses of input to the GHRH neurons, GH feedback direct to s_c, and no link
from GHRH to somatostatin. GHRH pulse interval 15 min, GH feedback delay 30 min.
This gives a regular pattern of GH release bursts, with spacing similar to real release pat-
terns in the male rat. With the random input pulses the bursts vary in shape and pulse am-
plitude but the three-hourly pattern remains

7.5 A GHRH-Somatostatin Connection

There is indirect experimental evidence for an inhibitory link from GHRH neurons
to somatostatin neurons. It was not obvious without testing what effect this would
have on the model.
A new Hill equation, s_act, was used to model the action of GHRH on soma-
tostatin:
s_act = 1 - r^n3 / (r^n3 + th3^n3)   (11)
This is similar to r_act, with its own parameters for threshold and binding coeffi-
cient, th3 and n3. It is added to the somatostatin equation by allowing s_act to
modify the release component:

(12)

The effect of this new component has been to make GHRH bistable, producing ei-
ther large or very small pulses rather than a range controlled by the somatostatin
inhibition. This is similar to making the somatostatin inhibition sharper, but makes
the partially inhibited GHRH pulses larger instead of removing them. Small
GHRH pulses are too small to inhibit somatostatin, but larger pulses that are par-
tially inhibited are able to inhibit somatostatin, thus removing their own inhibition
and becoming very large pulses. This in turn increases the level of GH release and
so increases the GH effect on somatostatin, prolonging the period between bursts
(Fig. 6). This works particularly well with random GHRH input pulses, countering
the variation in GHRH pulse size and making the GH bursts much more regular in
shape and spacing.

Fig. 6. As Fig. 5, but with the addition of an inhibitory link from the GHRH to soma-
tostatin neurons, suggested by experimental results. This creates a bistability in GHRH ac-
tivity, which produces much larger GHRH pulses and in turn larger pulses of GH release. It
slightly increases the time between GH bursts

7.6 GH-Somatostatin Stimulatory Connection

Some experimental evidence suggests an additional effect of GH on somatostatin,
directly stimulating release and with a much shorter delay. The work with the
model which suggested using the delay in the first place shows that with no delay
this would have a negative effect, causing GH pulses to inhibit themselves. How-
ever, if there was a short delay, then it could act to create a refractory period be-
tween pulses within the GH burst. The model has been used to investigate this by
adding a new GH activation equation, h_act2, and allowing this to modify the
stimulatory component of the somatostatin equation:

h_act2 = h_delay2^n4 / (h_delay2^n4 + th4^n4)   (13)

(14)

By using the product to modify the release component this gives GH full con-
trol over somatostatin release, since I_s is just a constant. To combine the two rather
than replace, the sum would be used instead. Initial testing with the model bears
out the predictions. The long bursts of GH release are replaced by frequent, short,
low amplitude pulses. GH inhibits itself and also loses the long periods of low ac-
tivity, since there is no stimulus to maintain somatostatin release between GH
pulses (Fig. 7). However, there is an interesting effect if th4 is made very small,
so that the low basal levels of GH release can stimulate somatostatin release. This
returns burst release and also extends the periods of low GH release by making the
rate at which somatostatin release decays more linear. The decay rate will nor-
mally be non-linear because release is proportional to the amount of charge left on
s_c, but at the same time GH release is gradually decreasing with the fall in soma-
tostatin. Allowing GH to stimulate means that, while the pool of releasable soma-
tostatin is decreasing, the stimulation is increasing, and the two effects combine to
create a longer, more linear decline in somatostatin release. It is unlikely that such
low levels of GH have an effect in the real system but this is a good illustration of
how a model can find unexpected behaviour.

8 Conclusions

Modelling at the system level provides a powerful tool for making use of experi-
mental data. It forces the conclusions from real data to be formalised and provides
a structure for testing these conclusions. It illustrates how important it is to assess
components as part of a whole system rather than in isolation, and it provides a
method for doing this. When a model system is built it can then be used to predict
the results of experiments, so that we can better direct investigations in the real
system or simulate experiments which cannot be done in animal models. Model-
ling is the practical form of the more functional, rather than implementation-based,
language which is required to examine the brain at a higher level, looking at whole
systems rather than just biological components. It carries the risk of oversimplify-
ing, but this should not be a problem so long as the model's design has a proper
basis in experimental results. Biology is elegant and much of the apparent hope-
less complexity disappears when pieces are successfully distilled down to their
functional descriptions.


Fig. 7. Direct GH feedback to somatostatin. On the left, for comparison, the same random
input pattern as Fig. 5 is used. The tonic somatostatin electrical stimulation is replaced by
direct stimulation from GH. Making somatostatin release entirely dependent on GH feed-
back breaks the ability to maintain a prolonged somatostatin release, and the immediate in-
crease in somatostatin release with each GH pulse causes the pulses to be shortened and
have smaller amplitude. Together these cause the loss of pulsatile release. However, on the
right, if th4 is made small enough for basal GH to stimulate somatostatin then pulsatile re-
lease returns

The model of the GH system shows that only a few components, using very
simple dynamics, are sufficient to produce the pulsatile pattern of release. It is also
able to suggest what effects other components or connections might have on the
release pattern. It indicates that we need more experimental data, in particular on
how GH affects somatostatin release, since this connection appears to play the
most important role in mediating the long delay between bursts of GH release. The
model provides better possible explanations for experimental results, such as the
non-linear increase in GH release with increasing stimulation of the GHRH neu-
rons. With the inhibitory connection from GHRH to somatostatin, stimulation of
GHRH increases GHRH release and also decreases somatostatin release, causing
the non-linear increase in GH release. This suggests an experiment repeating the
GHRH stimulation with the addition of somatostatin antiserum, to test whether the
non-linear increase in GH remains.

References

1. Sawchenko, P. E., Swanson, L. W., Rivier, J., Vale, W. W. (1985). "The distribution
of GH releasing factor (GRF) immunoreactivity in the CNS of the rat: An immunohis-
tochemical study using antisera directed against rat hypothalamic GRF." Journal of
Comp. Neurology 237: 100-115.
2. Wiegand, S. J. and Price, J. L. (1980). "The cells of origin of the afferent fibres to the
median eminence of the rat." Journal of Comp. Neurology 192: 1-19.
3. Dierickx, K. and Vandesande, F. (1979). "Immunocytochemical localization of soma-
tostatin containing neurons in the rat hypothalamus." Cell Tissue Research 201: 349-
359.
4. Aguila, M. C. and McCann, S. M. (1993). "GH increases somatostatin release and
messenger ribonucleic acid levels in the rat hypothalamus." Brain Research 623: 89-
94.
5. Kamegai, J., Minami, S., Sugihara, H., Higuchi, H. and Wakabayashi, I. (1994). "GH
induces expression of the c-fos gene on hypothalamic neurons in hypophysectomized
rats." Endocrinology 135: 2765-2771.
6. Epelbaum, J., Moyse, E., Tannenbaum, G. S., Kordon, C. and Beaudet, A. (1989).
"Combined autoradiographic and immunohistochemical evidence for an association of
somatostatin binding sites with GH-releasing factor-containing nerve cell bodies in the
rat arcuate nucleus." Journal of Endocrinology 1: 109-115.
7. Meister, B., Hokfelt, T., Vale, W. W. and Goldstein, M. (1985). "GH-releasing factor
(GRF) and dopamine coexist in hypothalamic arcuate neurons." Acta Physiologica
Scandinavica 124: 133-136.
8. Meister, B. and Hokfelt, T. (1988). "Peptide- and transmitter-containing neurons in the
mediobasal hypothalamus and their relation to the GABAergic systems: possible roles
in control of prolactin and GH secretion." Synapse 1: 585-605.
9. Meister, B., Hokfelt, T., Vale, W. W., Sawchenko, P. E., Swanson, L. W. and Gold-
stein, M. (1986). "Coexistence of tyrosine hydroxylase and GH-releasing factor in a
sub-population of tuberoinfundibular neurons of the rat." Neuroendocrinology 42:
237-247.
10. Wakabayashi, I., Miyazawa, Y., Kanda, M., Miki, N., Demura, R., Demura, H. and
Shizume, K. (1977). "Stimulation of immunoreactive somatostatin release from hypo-
thalamic synaptosomes by high K+ and dopamine." Endocrinologia Japonica 24: 601-605.
11. Kitajima, N., Chihara, K., Abe, H., Okimura, Y., Fujii, Y., Sato, M., Shakutsui, S.,
Wantanabe, M. and Fugita, T. (1989). "Effects of dopamine on immunoreactive GH-
releasing factor and somatostatin secretion from rat hypothalamic slices in vitro." En-
docrinology 124: 69-76.
12. Locatelli, V., TorseUo, A., Redaelli, M., Ghigo, E., Massara, F. and Muller, E. E.
(1986). "Cholinergic agonist and antagonist drugs modulate the GH response to GH-
releasing hormone in the rat: evidence for mediation by somatostatin." Journal of En-
docrinology 111: 271-278.
13. Day, T. A. and Willoughby, J. O. (1980). "Noradrenergic afferents to the median emi-
nence: inhibitory role in rhythmic GH secretion." Brain Research 202: 335-345.
14. Arnold, M. A. and Fernstrom, J. D. (1980). "Administration of antisomatostatin serum
to rats reverses the inhibition of pulsatile GH secretion produced by injection of me-
tergoline but not yohimbine." Neuroendocrinology 31: 194-199.

15. Terry, L. C., Crowley, W. R. and Johnson, M. D. (1982). "Regulation of episodic GH
secretion by the central epinephrine system." Journal of Clinical Invest. 69: 104-112.
16. Dickson, S. L., Leng, G. and Robinson, I. (1993). "Growth-Hormone Release Evoked
By Electrical-Stimulation of the Arcuate Nucleus in Anaesthetized Male-Rats." Brain
Research 623(1): 95-100.
17. Dickson, S. L., Leng, G. and Robinson, I. (1994). "Electrical-Stimulation of the Rat
Periventricular Nucleus Influences the Activity of Hypothalamic Arcuate Neurons."
Journal of Neuroendocrinology 6(4): 359-367.
18. Wehrenberg, W. B., Brazeau, P., Luben, R., Bohlen, P. and Guillemin, R. (1982). "In-
hibition of the Pulsatile Secretion of Growth-Hormone By Monoclonal-Antibodies to
the Hypothalamic Growth-Hormone Releasing-Factor (GRF)." Endocrinology 111(6):
2147-2148.
19. Wehrenberg, W. B., Brazeau, P., Luben, R., Ling, N. and Guillemin, R. (1983). "A
Non-Invasive Functional Lesion of the Hypothalamo-Pituitary Axis For the Study of
GH-Releasing Factor." Neuroendocrinology 36(6): 489-491.
20. Ferland, L., Labrie, F., Jobin, M., Arimura, A. and Schally, A. V. (1976). "Physiologi-
cal role of somatostatin in the control of GH and thyrotropin secretion." Biochemical
& Biophysical Research Communications 68(1): 149-56.
21. Clark, R. G. and Robinson, I. C. A. F. (1985a). "Effects of a Fragment of Human GH-
Releasing Factor in Normal and Little Mice." Journal of Endocrinology 106(1): 1-5.
22. Clark, R. G. and Robinson, I. C. A. F. (1985b). "Growth-Hormone Responses to Mul-
tiple Injections of a Fragment of Human GH-Releasing Factor in Conscious Male and
Female Rats." Journal of Endocrinology 106(3): 281-289.
23. Clark, R. G. and Robinson, I. C. A. F. (1987). "Growth-Hormone (GH) and Body-
Growth Responses to Intermittent Somatostatin (SS) Infusions in the Rat." Journal of
Physiology-London 382: 33.
24. Clark, R. G., Carlsson, L. M. S., Rafferty, B. and Robinson, I. C. A. F. (1988a). "The
Rebound Release of Growth-Hormone (GH) Following Somatostatin Infusion in Rats
Involves Hypothalamic GH-Releasing Factor Release." Journal of Endocrinology
119(3): 397-404.
25. Clark, R. G., Carlsson, L. M. S. and Robinson, I. C. A. F. (1988b). "Growth-Hormone
(GH) Secretion in the Conscious Rat - Negative Feedback of GH On Its Own Re-
lease." Journal of Endocrinology 119(2): 201-209.
26. Carlsson, L. M. S., Clark, R. G. and Robinson, I. C. A. F. (1990). "Sex Difference in
Growth-Hormone Feedback in the Rat." Journal of Endocrinology 126(1): 27-35.
27. Carlsson, L. M. S. and Jansson, J. O. (1990). "Endogenous Growth-Hormone (GH)
Secretion in Male-Rats Is Synchronized to Pulsatile GH Infusions Given At 3-Hour In-
tervals." Endocrinology 126(1): 6-10.
28. Clark, R. G. and Robinson, I. C. A. F. (1988). "Paradoxical growth-promoting effects
induced by patterned infusions of somatostatin in female rats." Endocrinology 122:
2675-2682.
Hierarchies of Machines

M. Holcombe

Department of Computer Science, University of Sheffield,


Regent Court,Portobello Street, Sheffield, SI 4DP, UK
m.holcombe@dcs.shef.ac.uk

Abstract. Computational models have been of interest in biology for many years
and have represented a particular approach to trying to understand biological
processes and phenomena from a systems point of view. One of the most natural
and accessible computational models is the state machine. These come in a variety
of types and possess a variety of properties. This chapter discusses some useful
ones and looks at how machines involving simpler machines can be used to build
plausible models of dynamic, reactive and developing biological systems which
exhibit hierarchical structures and behaviours.

1 Introduction: Computational Models

Much of the early work using state machines and related models for modelling
biological processes and systems was rather abstract and high level, and probably
seemed, to many, to be of more philosophical than practical value. There have,
however, been some advances in the development of more realistic models, and
the current state of computer science research provides us with new opportunities,
both through the emergence of models that can model seriously complex systems
and through the support that modern software can give to the modelling process. This
chapter describes a few of the early simple models and then goes on to look at
some new ideas in the area. Some general principles relating to how new and
emerging computational techniques can help us to represent and understand ex-
tremely complex models conclude the chapter.
Computational models are models of systems inspired by the model of an in-
formation processing system, in its most common manifestation a digital com-
puter, but it is not as restrictive as that in practice.
The principal philosophy for this work is the representation of some aspects of
cellular metabolism as computation over suitably defined data. The main benefit is
the opportunity to make use of some of the theoretical models of general, concur-
rent computing systems in the hope that:
• the computational models may enlighten the biochemical theory
• the analysis of successful parallel biochemical systems will enlighten the the-
ory of parallel computing.

An early attempt to model metabolic processing was due to Krohn et al. [1],
who used finite state automata to model basic metabolic reactions. Their principal
aim was to utilise a mathematical theorem which described how any such ma-
chine could be decomposed into a collection of very elementary machines: reset
machines and simple group machines. The use of a computer algorithm to help
with this decomposition was feasible and there was a hope that the information
would be useful in trying to understand the biochemistry.
The key issues are:
• Systems interacting with their environment (see Fig. 1.)
• Information processing models
• Models exhibiting communication and concurrency
• Models that might be amenable to automated analysis as well as simulation.


Fig. 1. A simple system interacting with its environment

The most basic discrete model is the finite state machine (see Fig. 2). It has:
• a set of internal states
• a set of external inputs - events
• a set of system outputs - actions (this is an optional feature; in some machines
there are no explicit outputs)
• a transition structure to link it all together
In Fig. 2 the inputs are a, b, c and the outputs are x, y.
How does this model work? Recall that the system is always in one of the in-
ternal states and state change is caused by the receipt of inputs.
• If in state1 and input (event) a occurs then it changes to state2 and outputs x.
• If in state1 and input b occurs then it changes to state3 and output y occurs.

Fig. 2. A simple state machine
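The two transitions described above can be written down directly as a table-driven machine; a minimal Python sketch (state and input names follow the description of Fig. 2):

```python
# Table-driven sketch of the simple state machine: the transition
# structure maps (state, input) to (next state, output).

TRANSITIONS = {
    ("state1", "a"): ("state2", "x"),
    ("state1", "b"): ("state3", "y"),
}

def step(state, event):
    """Consume one input event; return the new state and the output."""
    return TRANSITIONS[(state, event)]

def run(state, events):
    """Feed a stream of events through the machine, collecting outputs."""
    outputs = []
    for event in events:
        state, out = step(state, event)
        outputs.append(out)
    return state, outputs
```

The machine simply waits in its current state until the next event arrives, exactly as the text describes; inputs with no entry in the table are undefined for this sketch.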

The system then waits for the next input to occur before the next state change.
This continues as long as the system continues to function. These models have
been used to analyse metabolic pathway models [1], and we can illustrate this for
the simple Krebs (tricarboxylic acid) cycle, Fig. 3.

Fig. 3. A model of the Krebs cycle

If we regard the inputs as being C1, C2 and C3, which are specific co-enzymes
that drive the cycle, and the intermediate substrates as being the states of the sys-
tem, then it behaves like a state machine.
Enzymes that are required for each reaction are assumed to be present in suffi-
cient concentration.
This is a simple model of organisation which ignores many factors, such as re-
action kinetics, enzyme production, etc. Several quite complex metabolic path-
ways have been modelled in this way.
The algebraic theory of these machines can be used to construct decomposi-
tions into simple components (generated by finite simple groups - a theory that is
now well understood in mathematics). What the theory is saying is that any sys-
tem of this type can be broken up naturally into a collection of subsystems, manu-
factured from simple mathematical objects, which are joined together as systems
in two main ways - parallel and serial connections. For a full description of the
decomposition theory and its application to the Krebs cycle see Holcombe [2],
which is an extension of the holonomy decomposition theory of Eilenberg [3]. In
the case of the Krebs cycle the algebraic structure of the machine can be decom-
posed to form an equivalent machine which is built up from cyclic groups of or-
der 2 and 3 connected together as wreath products, which are also combined with
some simple aperiodic semigroups of order 2 and 3.
What does this decomposition mean, biochemically? It does not seem immedi-
ately clear and for this reason research in this area seems to have faded away.
However, the impact of a greater understanding of the genomic basis for the pro-
duction of the enzymes to drive the system might throw new light on the problem.
Many systems can be modelled in this way but:
• It is unsuitable for complex systems because of state space explosion.
• The functions represented by these machines are too simple for many situa-
tions.
• Continuous behaviour is not modelled.
• Concurrent systems are not modelled well without large problems.
• Communication is hard to model this way.
Cellular automata, however, are built from these machines and have proved
useful in some cases, for example:
• Models of simple development
• Models of simple ecologies.
The idea of a computational model stems from the work of Turing, and others,
who, before digital computers were ever built, wondered exactly what could be
computed by a hypothetical machine, and this led to the identification of a class of
generalised state machines, Turing machines, which seemed to capture the notion
of computation. Turing machines are directly related to other approaches to the
definition of computation, recursive functions, Markov algorithms etc. and this
provided more evidence that the notion of computation had been defined in a
plausible manner. Recent work on the theory of quantum computers has taken this
work into a new direction.
The idea that a cell, a system of cells or a complete organism can be viewed as
carrying out some form of computation has been a significant factor in the appli-
cation of computational models to the modelling and simulation of biology. We
have seen that a particularly simple computational model, the finite state machine,
can be used to model aspects of a metabolic pathway and this also provides some
interesting connections between biology and computing.

2 More Powerful Machines

The machines that we have considered are very simple and thus very limited in
their capabilities, and do not represent a realistic basis for modelling biological
systems and processes. The Turing machines, on the other hand, are too abstract
and unwieldy to provide a suitable answer. We investigate some new model types
and see how they fare.

2.1 X-machines

A more sophisticated and powerful model can be introduced if we simply add an
internal memory and adjust the operation of the machine to match; see Fig. 4.
The system is in some state, an input a is received, the initial contents of the
memory are m and, depending on both a and m, the system changes state and
produces an output x and updates the memory to m'.
This model is the stream X-machine [4] and is very general. It can model al-
most all computations and has been much studied in the last 5 years. The simple
device of allowing an arbitrary internal memory has opened up its capabilities in a
remarkable way. It is possible, using this, to model very complex systems that
would, otherwise, require state spaces with millions of states. The stream X-
machine model allows us to abstract these complexities and to wrap them up in
hierarchical structures that can be analysed much more easily.


Fig. 4. A conceptual model of a stream X-machine

Let us consider how we might use an X-machine to model a cell, or part of a
cell, by looking at the first example of a simple state machine modelling the meta-
bolic processing of part of a cell. One of the problems with this model was that
enzymes were not modelled directly. We overcome this by using the memory as a
means of describing the concentration of the various enzymes involved. In its sim-
plest form this consists of a vector of concentrations of the enzymes, and the tran-
sitions between the states will only operate if a suitable threshold value of the re-
quired enzyme exists in the memory.
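This enzyme-gated memory can be sketched concretely. In the sketch below, the enzyme name, the threshold, and the rule that firing consumes a fixed amount of enzyme are all illustrative assumptions made for the example, not details given in the text.

```python
# Sketch of one stream X-machine step: a transition fires only if the
# memory (a vector of enzyme concentrations) holds at least a threshold
# level of the required enzyme. Enzyme names, thresholds and the
# consumption rule are illustrative assumptions.

def make_transition(next_state, output, enzyme, threshold, used=0.1):
    def phi(memory, inp):
        if memory.get(enzyme, 0.0) < threshold:
            return None  # transition blocked: not enough enzyme
        new_memory = dict(memory)
        new_memory[enzyme] -= used  # firing consumes some enzyme
        return next_state, new_memory, output
    return phi

# One transition of a metabolic machine: substrate A -> B, needing enzyme E1.
machine = {("A", "c1"): make_transition("B", "B-produced", "E1", 0.5)}

def step(state, memory, inp):
    """Apply the transition for (state, input); stay put if it is blocked."""
    phi = machine.get((state, inp))
    result = phi(memory, inp) if phi else None
    return result if result else (state, memory, None)

state, memory, out = step("A", {"E1": 0.6}, "c1")   # enough E1: fires
blocked = step("A", {"E1": 0.1}, "c1")              # too little E1: blocked
```

The underlying state diagram is unchanged from the finite state machine version; the memory and the guard on each transition are what the X-machine adds.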
256 M. Holcombe

Let us consider a simple, idealised cellular architecture illustrated in Fig. 5. The general information flows are indicated using the thick arrows and are meant to be indicative of the sort of functional relationships that might apply at different processing sites.

[Figure: a hypothetical cell, with a nucleus and several processing sites (e.g. mitochondria) linked by thick arrows indicating information flow.]

Fig. 5. A hypothetical cell indicating active sites

The analogy with the VLSI model is strengthened when we consider some of the sorts of processing that might be involved. For example, simple metabolic processes might correspond to the behaviour of a register: information (molecules) might be stored temporarily at a location prior to being further transformed by another processing site. We might have sites performing fairly simple processing akin to that of an adder, which operates when its input places contain appropriate values, and then there are structures, like multipliers, that operate continuously. The overall model of the system could then be determined with reference to the data types involved at the various active sites and the functions operating on the system described at this level.
The fundamental data type will be of the form:

X = INms1 × INms4 × [ms1] × [ms2] × [ms3] × [ms4] × [N] × [P1] × [P2] × [P3] × OUTms2 × OUTms3

where each element defines the type of values that the site can deal with, and would be defined in a formal manner following our previous work [1, 5]; INms1 and INms4 are the input data types at ms1 and ms4 respectively, and OUTms2 and OUTms3 are the output data types at ms2 and ms3 respectively.
The next stage is to identify when the various sites are active and from this we
construct a top level model of the system using the specific processing functions
that determine what happens to the data.
In Fig. 6 we note the existence of functions indicated by the arrows; thus we can list the functions and the appropriate data types. As a convention, let us indicate input data types and output data types to sites as follows: for site K we define IN(K) and OUT(K) to be the appropriate types, and COMP(K) : IN(K) ⇸ OUT(K) to be the specific processing function (which might be a partial function, allowing for processing to be undefined under certain data conditions; the processor then stops until the situation is resolved).
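The idea of COMP(K) as a partial function can be sketched directly. This is an illustrative sketch: the 'adder' site and its behaviour are hypothetical, and partiality is represented by returning None when processing is undefined.

```python
# Sketch: a site's COMP function is partial; when its inputs fall outside
# its domain it returns None, modelling a site that stalls until the
# data condition is resolved.

def comp_adder(inputs):
    # behaves like an 'adder' site: fires only when both input places hold values
    a, b = inputs
    if a is None or b is None:
        return None  # undefined: processing stalls
    return a + b

print(comp_adder((2, 3)))     # 5
print(comp_adder((2, None)))  # None -> site waits
```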
It should be remarked that, although discrete models were the inspiration for
this approach, we are making no assumptions about the type of the functions and
types being considered here. What we do need is some concept of the start and the
end of processing activity by each component site. This may raise issues, later, of
time and its role in these models. It is possible to augment the X-machine model
with time information but we will not deal with this added complication here.

Fig. 6. A possible state space describing the functional dynamics of the cell

Returning to our hypothetical cell, we note that in some cases there are relationships between types. The environmental inputs and outputs are INms1, INms4, OUTms2 and OUTms3, and, for example:

IN(ms3) = [P3] × [P2],
IN(P3) = [ms4] × [N],
IN(P1) = [N],
etc.
The component functions are described thus:

COMP(ms1) : INms1 ⇸ [ms1],
COMP(ms2) : [P1] ⇸ OUTms2,
COMP(ms3) : [P2] × [P3] ⇸ OUTms3,
COMP(ms4) : INms4 ⇸ [ms4],
COMP(N) : [ms1] ⇸ [N],
COMP(P1) : [N] ⇸ [P1],
COMP(P2) : [N] ⇸ [P2],
COMP(P3) : [ms4] × [N] ⇸ [P3].
The system functions are all defined on X and operate on the elements of X as follows:

FUNms1 : (i1, -, a, -, -, -, -, -, -, -, -, -) → (i1, -, i1, -, -, -, -, -, -, -, -, -);
FUNms4 : (-, i2, -, -, -, a, -, -, -, -, -, -) → (-, i2, -, -, -, i2, -, -, -, -, -, -);
FUNN : (-, -, a, -, -, -, -, -, -, -, -, -) → (-, -, -, -, -, -, b, -, -, -, -, -)
where COMP(N)(a) = b ∈ [N];
FUNP1 : (-, -, -, -, -, -, a, -, -, -, -, -) → (-, -, -, -, -, -, -, b, -, -, -, -)
where COMP(P1)(a) = b ∈ [P1];
FUNP2 : (-, -, -, -, -, -, a, -, -, -, -, -) → (-, -, -, -, -, -, -, -, b, -, -, -)
where COMP(P2)(a) = b ∈ [P2];
FUNP3 : (-, -, -, -, -, a, b, -, -, -, -, -) → (-, -, -, -, -, -, -, -, -, c, -, -)
where COMP(P3)(a, b) = c ∈ [P3];
FUNms2 : (-, -, -, -, -, -, -, a, -, -, -, -) → (-, -, -, -, -, -, -, -, -, -, b, -)
where COMP(ms2)(a) = b ∈ OUTms2;
FUNms3 : (-, -, -, -, -, -, -, -, a, b, -, -) → (-, -, -, -, -, -, -, -, -, -, -, c)
where COMP(ms3)(a, b) = c ∈ OUTms3.
The processing is determined by which sites are active and which functions are processing; we can postulate an abstract scenario for such a system in Fig. 6. However, we are not claiming this to be a realistic example; it is just an indication of the possibilities of the method. It would be possible, once the specific details of the various functions were known, to simulate such a system in operation and to analyse various aspects of its behaviour.
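The pattern of the system functions above, a site function lifted to act on the whole tuple X, can be sketched generically. This is an illustrative sketch; the stand-in for COMP(N) and the doubling behaviour are hypothetical, and slot indices are 0-based in code (slot 3 of the text is index 2, slot 7 is index 6).

```python
# Sketch: lift a site function COMP onto the system tuple X. The lifted
# function reads its source slot(s), writes its target slot, and leaves
# every other slot unchanged; if COMP is undefined, nothing changes.

def lift(comp, sources, target):
    def fun(x):
        x = list(x)
        result = comp(*(x[i] for i in sources))
        if result is None:
            return tuple(x)  # undefined: processing waits
        x[target] = result
        return tuple(x)
    return fun

# Stand-in for COMP(N) : [ms1] -> [N]; FUN(N) reads index 2, writes index 6
comp_N = lambda a: None if a is None else a * 2
fun_N = lift(comp_N, sources=[2], target=6)

x0 = [None] * 12
x0[2] = 5                      # a value sitting at [ms1]
print(fun_N(tuple(x0))[6])     # 10
```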
The memory can be used to keep information about many aspects of the system. It is likely to be constructed in a hierarchical way, and this can be explained by the following consideration. If the cell is a collection of different subsystems organised in specific geographical regions then we can describe this as follows, see Fig. 7. We use the notion of a membrane system. The memory describes key factors that relate to the different membranes.
In the normal non-mitotic state each compartment, Mi, will have associated with it a number of variables representing metabolic activity in that region. Equations describing how these variables change over time are defined. Some of the activity will involve transfer across membranes, and this will also be defined using suitable equations. These equations will involve variables in both compartments, one located within the other, and the complete set of linked equations will describe the overall behaviour of that part of the system. It can be represented by the use of a nested list of variables which groups related variables together, so the memory state of the left-hand structure in Fig. 7 would be written as:

(m1, (m3, (m4)), (m2))

where each mi is a collection of variables from compartment mi.
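The nested-list memory can be sketched as a recursive structure. This is an illustrative sketch; the compartment labels and variable names are hypothetical, chosen only to mirror the nesting of Fig. 7.

```python
# Sketch: the hierarchical memory as a nested structure mirroring
# membrane containment: m1 contains m3 (which contains m4) and m2.
# Each node is (label, variables, children).

memory = ('m1', {'pH': 7.2},
          [('m3', {'ATP': 1.4},
            [('m4', {'Ca': 0.1}, [])]),
           ('m2', {'NADH': 0.3}, [])])

def lookup(node, name):
    # depth-first search for a compartment's variable set
    label, variables, children = node
    if label == name:
        return variables
    for child in children:
        found = lookup(child, name)
        if found is not None:
            return found
    return None

print(lookup(memory, 'm4'))  # {'Ca': 0.1}
```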

Fig. 7. A cell with a collection of subsystems contained in membranes

2.2 Communicating X-machines [6]

To model systems that operate concurrently and which communicate with each other in a more efficient way we introduce a new approach. Consider a number of separate stream X-machines, Fig. 8, which have the following properties:
1. There are certain communication channels between some of the machines.
2. Some of the states in these machines are solely used for communicating messages to other machines.
3. The other states are ordinary processing states.
Each machine works separately and concurrently in an asynchronous manner until it reaches a communication state. It then waits to either:
• send a message to another machine, or
• receive a message from another machine, which may not yet be ready to send it. The receiving machine must then wait for the message.

[Figure: two X-machines linked by channels A and B; certain states in each machine are marked as communicating states.]

Fig. 8. A communicating X-machine system



Once a machine has been involved in a communication event successfully it can proceed with further internal processing until it reaches the next communication state. There are a number of slight variants on this model. All, however, try to model a type of asynchronous communication in a reasonably simple and intuitive way. The message passing between individual machines is controlled by a matrix, illustrated in Fig. 9:
[Figure: the communication matrix, with annotations showing that machine 1 can write to machine 2 by using one slot, that machine 2 can read from machine 1 at that slot, and that further slots handle communication between machines 2 and 3.]

Fig. 9. The communication matrix
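The matrix-mediated message passing can be sketched as follows. This is an illustrative sketch with a hypothetical API: slot (i, j) carries a message from machine i to machine j, a sender blocks while the slot is occupied, and a receiver blocks while it is empty.

```python
# Sketch of the communication matrix for N machines.

class CommMatrix:
    def __init__(self, n):
        self.slots = [[None] * n for _ in range(n)]

    def send(self, sender, receiver, message):
        if self.slots[sender][receiver] is not None:
            return False  # slot occupied: sender must wait
        self.slots[sender][receiver] = message
        return True

    def receive(self, receiver, sender):
        message = self.slots[sender][receiver]
        self.slots[sender][receiver] = None  # consume the message
        return message  # None means nothing sent yet: receiver waits

m = CommMatrix(3)
m.send(0, 1, 'signal-A')   # machine 0 writes into machine 1's slot
print(m.receive(1, 0))     # 'signal-A'
print(m.receive(1, 0))     # None -> receiver waits
```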

2.3 Hybrid Machines

All of the machines discussed previously are discrete, so only instantaneous processing can be modelled and only finite discrete data are processed. Continuous functions and real-valued data cannot be incorporated into traditional finite state machine models. The hybrid X-machine (or just hybrid machine) [7] overcomes this.
A hybrid machine has states and transitions as usual, and responds to discrete events and performs discrete actions which are observable. The internal memory consists of:
• a set of discrete variables,
• a set of continuous variables.
When it is in a given state there are sets of equations that apply to the system's continuous variables, and all the while it is in that state, with time progressing, these variables change according to these equations. When either an appropriate external event occurs or a leaving condition is met (e.g. a set point is reached) the system moves to its next state, where a different set of equations takes over, see Fig. 10.
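The state-dependent dynamics and set-point transitions can be sketched with a simple Euler integration. This is an illustrative sketch only; the two states, their rates and the set points are hypothetical, not an example from the chapter.

```python
# Sketch of a hybrid machine: in each state a continuous variable x
# evolves under that state's equation; crossing a set point triggers
# the transition to the next state.

def run_hybrid(x=0.0, dt=0.01, t_end=5.0):
    state = 'filling'
    dynamics = {'filling': lambda x: 2.0,    # dx/dt = +2 while filling
                'draining': lambda x: -1.0}  # dx/dt = -1 while draining
    t = 0.0
    trace = []
    while t < t_end:
        x += dynamics[state](x) * dt         # Euler step under current state
        if state == 'filling' and x >= 1.0:
            state = 'draining'               # leaving condition: set point hit
        elif state == 'draining' and x <= 0.0:
            state = 'filling'
        t += dt
        trace.append((round(t, 2), state, round(x, 3)))
    return trace

trace = run_hybrid()
print(trace[0], trace[-1])
```

The resulting trajectory alternates between the two regimes, much like the piecewise behaviour sketched in Fig. 11.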

[Figure: states connected by transitions labelled event/action.]

Fig. 10. A hybrid machine

Some simple examples that can be modelled this way include:
• ion flow through voltage-gated channels,
• antigen–antibody interaction [7].
The continuous variables can exhibit complex behaviour, see Fig. 11. The equations involved are often composed of relatively simple functions compared to equations that try to describe the complete behaviour over all states.

[Figure: a continuous variable plotted against time, with different behaviour in each of state1 to state4.]

Fig. 11. The behaviour of a variable of a hybrid machine



3 Agents and Agent Systems

Agents are autonomous systems that interact with their environment, and with other agents, using a set of rules to describe their behaviour.
They can be simple reactive systems or they can be very sophisticated with
complex strategies and learning capabilities. The important aspect of them is that
they are a low level phenomenon whose behaviour in their environment emerges
according to the rules. In a different environment the behaviour may change.
Agents can cooperate or compete with one another.
Many biological processes seem to behave like agents; an interesting example is the system of ants and their behaviour as they form trails between the nest and a food source, Fig. 12.

[Figure: ants moving between the nest and a food source along a pheromone trail, with an obstacle in the path.]

Fig. 12. A diagram of an insect trail

Suppose that these ants are simple agents that obey rules relating to the concen-
tration of pheromone and communication events with other ants.
Rules, in order of priority, might be:
1. If detect obstacle then change direction
2. If detect pheromone then follow trail
3. If meet ant with food then continue
4. If find food pick up, turn round and lay trail
5. If meet ant with no food then turn round
6. If true then move randomly
etc.
This agent can be described in full detail using an X-machine model, where each function is defined by the rules and the associated inputs and memory. Memory will be the position; whether with or without food; the direction the agent is moving; and other relevant parameters. See Fig. 13 for a state diagram of such an agent. The details of the function definitions are left out of this review due to space limitations.
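The prioritised rule list above can be sketched directly: the first rule whose condition holds determines the action. This is an illustrative sketch; the percept field names are hypothetical.

```python
# Sketch of the ant agent's prioritised rules: scan the rules in order
# and take the action of the first one whose condition is satisfied.

RULES = [
    (lambda p: p['obstacle'],             'change direction'),
    (lambda p: p['pheromone'],            'follow trail'),
    (lambda p: p['met_ant_with_food'],    'continue'),
    (lambda p: p['food_here'],            'pick up, turn round, lay trail'),
    (lambda p: p['met_ant_without_food'], 'turn round'),
    (lambda p: True,                      'move randomly'),  # default rule
]

def act(percept):
    for condition, action in RULES:
        if condition(percept):
            return action

percept = {'obstacle': False, 'pheromone': True, 'met_ant_with_food': False,
           'food_here': False, 'met_ant_without_food': False}
print(act(percept))  # follow trail
```

In the X-machine view, each rule corresponds to a function guarded by the current input and memory.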

[Figure: a state diagram with states linked by transitions such as move and change direction.]

Fig. 13. An agent as an X-machine

This will be a hybrid machine. Whilst in the follow trail state the motion could be determined by standard equations of motion. A refinement is to introduce a probabilistic dimension to the possible state changes. Other examples of agents can be found in the immune system, e.g. T cells, B cells etc.; different molecular species could also be interpreted in this way.
The next issue is how to model communities of agents. Suppose that we have a collection of agents, in this case represented as individual hybrid X-machines. We now have to identify the communication channels and how these might work. Suppose that there are N agents and each is potentially able to communicate with any other. We thus have an N × N communication matrix.
Suppose that an agent can only communicate with other agents within a specific distance of it. There is a global memory that maintains the current position of each agent (an N-vector of coordinates). At any communication state an agent can interrogate this memory to ascertain which other agents are within communication distance. A number of strategies can be used to determine which, and how many, attempts at communication can be made. The agent then puts data into the appropriate slots of the communication matrix and continues processing, moving or whatever.

Some of these data may time out if the agent they are intended for fails to retrieve them. Similarly, when an agent is within reach of another, and wishes to do so, it can retrieve data from the specific slot. Agents can die, in which case both the row and the column in the matrix become empty (void). Agents can be created, in which case the matrix is expanded by an extra row and column and the memory extended as appropriate.
Some simple ground rules must cover situations where two agents try to occupy the same coordinates. We should be able to simulate communities of simple agents in this way, although when N becomes very large this may be a problem. Hopefully, emergent properties can be identified, analytically or numerically. This needs further investigation.
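The distance-limited neighbourhood query against the global position memory can be sketched as follows. This is an illustrative sketch; the coordinates and the use of None for a dead agent's voided entry are hypothetical choices.

```python
# Sketch: the global memory holds the coordinates of each agent; at a
# communication state an agent queries it for neighbours within range.
import math

positions = {0: (0.0, 0.0), 1: (1.0, 0.5), 2: (8.0, 8.0), 3: None}  # 3 is dead

def neighbours(agent, radius):
    x0, y0 = positions[agent]
    result = []
    for other, pos in positions.items():
        if other == agent or pos is None:  # skip self and voided entries
            continue
        if math.dist(pos, (x0, y0)) <= radius:
            result.append(other)
    return result

print(neighbours(0, radius=2.0))  # [1]
```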

4 Hierarchies of Machines

Most biological systems exhibit hierarchy in a number of ways. Clearly there is structural hierarchy, with components within components within the cell, etc. There is also functional hierarchy, with subsystems behaving on a variety of time scales and at a variety of levels. Hierarchical computational models can capture some of these issues.

4.1 Cellular Hierarchies

Suppose that we are looking at some specific metabolic processing within a cellular component. This might be modelled using a simple state machine or an X-machine. To be realistic it requires a number of conditions to hold; for example, certain enzymes will have to be present in sufficient concentrations at the location or region of the processing. These are in turn controlled by the genes that are currently expressed, and this is dependent on many other factors including metabolic activity in other parts of the system, intracellular messages and environmental influences.
We could thus consider the following hierarchical sort of model for a simple cell, Fig. 14:

[Figure: a three-level hierarchy with external signals and events at the top, a genetic regulation level in the middle and a basic metabolic level at the bottom, with communication between the levels.]

Fig. 14. A hierarchical model of a cell

Each level comprises a system consisting of communicating hybrid or discrete X-machines. The communication within a level is handled by the existing model. There is also communication between adjacent levels of the hierarchy, and directly between any one level and any other.
The communication between levels is based on the inputs, outputs and memory values that relate to the various levels. For example, the outputs from one level could be inputs to the level above or below. The synchronisation between the levels may be a combination of data-driven and timing-based communication.
For example, suppose that the processing at the lower, metabolic level proceeds at a much faster rate than the processing at the genetic regulation level. Most of the input–output communication may then be directly with a higher level, perhaps at the level of the external events across the cell membrane. There will then be a slower cycle of processing involving the intermediate genetic regulation events, which will provide "slow" changes to the enzyme availability and concentration for the lower models. This will directly impact the state of the memory for the models in this lower level, so the processing at this level will be moderated in an appropriate manner. The same can be said of other inter-level communications.
We are only at the start of using these sorts of techniques in the modelling of systems, and it remains to be seen how effective they are. They may well be useful in trying to clarify concepts and in the building of simulation software for such systems. One important factor is what the overall system behaves like. Because it has been proved that the X-machine model is fully general, and that systems of communicating X-machines are still X-machines, it is plausible for us to treat the model of the complete cell as an X-machine, albeit one composed of many levels of intercommunicating systems of communicating X-machines.
One issue that has not yet been addressed is the fundamental fact that cells and organisms do not remain structurally static but develop, divide, differentiate and ultimately die. We will try to look at this through the issue of modelling tissue.

4.2 Tissue Hierarchies

If we have a model of a cell we need to see how it might fit within a model of the collection of cells that form a coherent supersystem, such as a tissue system. The first point is that the cell model must reflect the fact that some cells change their behaviour during their life. This can be modelled by the switching on of parts of the hierarchy as a result of different events and internal factors, using the communication matrix and the memory to control which parts of the system are active and which parts are inactive or faulty. In the latter case we need to be able to intervene in the model and to 'damage' some transition.
For tissue we have another issue. We will start with the model of an individual cell and consider how tissues might be modelled. The obvious approach is to use the communicating X-machine model, but there is a problem in how we model the growth of the tissue from a starting point of a single cell or group of cells. Mitosis introduces a fundamentally new concept, that of cell division, and the treatment of this in the model needs to be addressed.
Again, the approach is to structure the memory. Here we have a memory that is based around a single membrane system, together with, possibly, other pieces of information (time, size and position being some possibilities). If each is associated with a specific cell then we can describe this as a vector of parameters, one of which is a membrane system which itself has significant internal structure and subdivisions. We call this the basic cell memory structure.
The development of two cells from an existing cell will take place in a particular state, a mitosis state; during this phase the structure of the memory will change. The memory value on entry to the mitosis state and the memory value on exit from this state will differ as follows, Fig. 15:
A natural way of dealing with this is to construct a binary tree of the basic cell memory structure (which is, in effect, the membrane structure and the other parameters associated with each cell) to represent the process of cell division while maintaining the information about each individual cell.
The state change describing the mitosis in Fig. 15 would be represented in the form of the transformation C → C1 · C2, where C1, C2 represent the two new daughter cells and the variable values involved in the original cell C are distributed amongst the two new ones. More work is needed to produce an effective model of mitosis.

[Figure: an original cell dividing into two daughter cells.]

Fig. 15. Mitosis
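The binary-tree treatment of division can be sketched as follows. This is an illustrative sketch only: the split rule (halving each variable) and the daughter labels are hypothetical, standing in for whatever distribution of variable values a real model of mitosis would specify.

```python
# Sketch: cell division grows a binary tree of basic cell memory
# structures; the parent's variables are distributed to the daughters.
# Each node is (label, variables, children).

def divide(cell):
    label, variables, children = cell
    half = {k: v / 2 for k, v in variables.items()}  # illustrative split rule
    daughters = [(label + '.1', dict(half), []),
                 (label + '.2', dict(half), [])]
    return (label, variables, daughters)

tissue = divide(('C', {'mass': 1.0}, []))
print(tissue[2][0])  # ('C.1', {'mass': 0.5}, [])
```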

5 Conclusions and Further Work

One fundamental problem in modelling complex systems such as biological systems will be trying to understand the complex interactions between many subsystems and the vastly complicated molecular and genetic activity that exists. We might be able to build these models, but will we be able to understand and analyse them? It is likely that we will only be able to do this if we simplify them greatly. As an alternative approach we developed the Hybrid Projection Temporal Logic (HPTL) [7] specifically for hybrid machines. This logic allows us to define such a machine in a precise formal logic, which is the first step towards using automated reasoning techniques. The basic process involves trying to establish properties about the model, now represented as a logical formula in HPTL. There are two related ways of doing this. First, we could try to prove theorems about the system by using theorem-proving engines, which is probably impracticable since the success of automated theorem provers in dealing with extremely complex systems is limited. An alternative approach is the use of model checking techniques [8], either alone or in combination with theorem proving. This is potentially feasible and would allow us to ask 'what if' questions and query whether the system could ever get into a state with a given property holding, etc. This is more feasible since model checking technology can handle models with very large state spaces. However, the technology needs to be substantially extended to cope with hybrid machines of this type. It does, nevertheless, offer a potentially rewarding direction for research.
We could, in the meantime, use simulation, in virtuo (or in silico), to run these models and derive some useful information about the system from them.

References

1. K. Krohn, R. Langer & J. Rhodes, Algebraic principles for the analysis of a biochemical system, J. Comp. and System Sci. 1, 119–136, 1967.
2. M. Holcombe, Algebraic Automata Theory, Cambridge University Press, Cambridge, 1982.
3. S. Eilenberg, Automata, Languages and Machines, Volume B, Academic Press, New York, 1976.
4. M. Holcombe & F. Ipate, Correct Systems: Building a Business Process Solution, Springer, Berlin Heidelberg New York, 1998.
5. N.J. Talbot, Trends in Microbiology, 9, 1995.
6. T. Balanescu, M. Holcombe, A.J. Cowling, H. Gheorgescu, M. Gheorghe, C. Vertan, Communicating stream X-machines systems are no more than X-machines, Journal of Universal Computer Science, 5(9), 494–507, 1999.
7. Z. Duan, M. Holcombe, A. Bell, A logic for biological systems, BioSystems, 55, 93–105, 2000.
8. E. Clarke, E. Emerson & A. Sistla, Automatic verification of finite-state concurrent systems using temporal logic specifications, ACM Trans. Prog. Lang. & Systems, 8, 244–263, 1986.
Models of Recombination in Ciliates

P. Sant
Department of Computer Science, King's College, London WC2R 2LS, UK
sant@dcs.kcl.ac.uk

M. Amos
School of Biological Sciences and Engineering and Computer Science, University
of Exeter, EX4 4JH, UK

Abstract. Ciliate is a term applied to any member of a group of around 10,000 different types of single-celled organisms that are characterised by two unique features: the possession of hair-like cilia for movement, and the presence of two nuclei instead of the usual one. One nucleus (the micronucleus) is used for sexual exchange of DNA, and the other (the macronucleus) is responsible for cell growth and proliferation. Crucially, the micronucleus contains an 'encoded' description of the working macronucleus, which is decoded during development. This encoding 'scrambles' functional gene elements by both the permutation of coding sequences and the inclusion of non-coding sequences. A picture of the ciliate Oxytricha nova is shown in Fig. 1. During development, ciliates reorganise the material in the micronucleus by removing non-coding sequences and placing coding sequences in the correct order. This 'unscrambling' may be interpreted as a computational process during which up to 95% of the original sequence is discarded. The exact mechanism by which genes are unscrambled is not yet fully understood. We first describe experimental observations that have at least suggested possible mechanisms. We then describe two different models of the process. We conclude with a discussion of the computational and biological implications of this work.

1 Introduction and Biological Background

The macronucleus consists of millions of short DNA molecules that are 'snipped out' of, or excised from, the micronucleus. Each macronuclear molecule corresponds to an individual gene, varying in size between 400 b.p. (base pairs) and 15,000 b.p. (the average size is 2,000 b.p.). The macronuclear DNA forms a very small proportion of the micronucleus, as up to 95% of micronuclear DNA forms intergenic 'spacers' and is eliminated during genetic excision (that is, only 5% of the micronucleus is coding DNA).
270 P. Sant, M. Amos

Fig. 1. Oxytricha nova (picture courtesy of D.M. Prescott)

During macronuclear development, individual genes are excised from the micronucleus and are, after the completion of this process, present as individual short molecules in the macronucleus. The formation of the macronucleus triggers the process of transcription, whereby genes are read and transcribed into RNA, the 'blueprint' for proteins.

1.1 IESs and MDSs

The process of decoding individual gene structures is therefore what interests us here. In the simplest case, micronuclear sequences contain many short, non-coding sequences called internal eliminated sequences, or IESs. These are short, AT-rich sequences and, as their name suggests, are removed from genes and destroyed during the development of the macronucleus. They separate the micronuclear version of a gene into macronuclear destined sequences, or MDSs (Fig. 2a). When IESs are removed, the MDSs making up a gene are 'glued together', or ligated, to form the functional macronuclear sequence. In the simplest case, IESs are bordered on either side by pairs of direct repeat sequences of between 2 and 7 b.p. in the ends of the adjacent MDSs (Fig. 2b).
The removal of IESs is a two-stage process; first, the correct pair of repeat sequences bordering the IES must be identified, then the IES must be cut out and the two MDSs ligated together. The molecular mechanisms by which these problems are solved are still poorly understood. It is thought that, in some cases, staggered cuts are made in the DNA to create 'sticky' versions of the repeats plus the ends of the MDSs. The repeats then align and the MDSs are ligated (this process is known as recombination). The excision of IESs and splicing together of MDSs can require up to several hundred thousand incredibly precise recombination operations to be carried out over a time scale of a few hours [1].

[Figure: a a micronuclear gene shown as MDSs 1, 2 and 3 interrupted by IES1 and IES2; b an IES flanked on either side by a direct repeat of 2–7 b.p.]

Fig. 2. a Schematic representation of the interruption of MDSs by IESs. b Repeat sequences flanking an IES

1.2 Scrambled Genes

In some organisms, the gene construction problem is complicated by the 'scrambling' of MDSs within a particular gene. In this situation, the correct arrangement of MDSs is present in a permuted form in the micronuclear DNA. For example, the actin I gene in Oxytricha nova is made up of nine MDSs and eight IESs, the MDSs being present in the micronucleus in the order 3-4-6-5-7-9-2-1-8 [2]. During the development of the macronucleus, the MDSs making up this gene are rearranged into the correct order at the same time as IES excision.
Scrambling is often further complicated by the fact that some MDSs may be inverted. Inverted MDSs are the reverse complement of the correct sequence (e.g., the inverse of the sequence CGT is derived as follows: reverse the sequence, giving TGC, then take the Watson-Crick complement, giving a reverse complement of ACG).

1.3 Fundamental Questions

Despite exhibiting these seemingly bizarre phenomena, ciliates are remarkably successful organisms. The range of DNA manipulation and reorganisation operations they perform has clearly been acquired during millennia of evolution. However, several fundamental questions remain: (1) what are the underlying molecular mechanisms of gene reconstruction, and how did they evolve? (2) how do ciliates 'know' which sequences to remove and which to keep?
Concerning the first question, Prescott proposes [1] that the 'compression' of a working nucleus from a larger predecessor is part of a strategy to produce a 'streamlined' nucleus in which 'every sequence counts' (i.e., useless DNA is not present). This efficiency may be further enhanced by the dispersal of genes into individual molecules, rather than their being joined into chromosomes.

In addition, the significance of the existence of MDSs and IESs is still largely a mystery. Subdivision, excision and rearrangement of their DNA appears to convey no evolutionary advantage upon these organisms.
We may, perhaps, have more success in attempting to answer the second question: how are genes successfully reassembled from an encoded version? In the rest of this paper we address this question from a computational perspective, and describe two extant models of the rearrangement process.

2 Models of Gene Construction

We now present a review of two models that attempt to shed light on the process of macronuclear gene assembly. In [3] Landweber and Kari propose an initial model that was subsequently enhanced in [4], where a formal model for gene rearrangement was presented.
Landweber and Kari propose two main operations that model the processes of intra- and intermolecular recombination. These can be used to unscramble a micronuclear gene to form a functional copy of the gene in the macronucleus. Both of these operations are based on the concept of repeat sequences 'guiding' the unscrambling process.

Operation 1: {uxwxv} ⇒ {uxv, wx}
Operation 2: {uxv, wx} ⇒ {uxwxv}
The first operation takes as input a single linear DNA strand containing two copies of a repeat sequence x. The operation 'folds' the linear strand into a loop, thus aligning the copies of x, and then 'cuts' the strands after the first copy of x and before the second copy of x, creating three strands (ux, wx and v). Finally, ux recombines with v, and wx forms a circular string. The output of the operation is therefore a linear string uxv and a circular string wx. This operation mimics the excision of an IES that occurs between two MDSs that are in the correct (i.e., unscrambled) order. In ciliates the IES is excised as a circular molecule and the two MDSs are 'sewn' together to make a single larger MDS.
The second operation takes as input a single linear strand and a separate circular strand, and creates a single linear strand. This allows the insertion of the circular strand within the linear strand, and mimics intermolecular recombination. Intermolecular recombination takes as input a circular molecule and a linear molecule, each of which contains a single copy of a direct repeat sequence x. The result is a single linear molecule containing two copies of the repeat.
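The two operations can be sketched on plain strings, with the repeat marked by a distinguished character. This is an illustrative sketch of the string-rewriting view, not a molecular simulation; the marker 'X' and the example strands are hypothetical.

```python
# Sketch of the two recombination operations on strings.
# op1: uxwxv -> (linear uxv, circular wx); op2 reverses it.

def op1(linear, x):
    i = linear.find(x)                  # first copy of the repeat
    j = linear.find(x, i + len(x))      # second copy
    u, w, v = linear[:i], linear[i + len(x):j], linear[j + len(x):]
    return u + x + v, w + x             # (linear uxv, circular wx)

def op2(linear, circular, x):
    i = linear.find(x)
    k = circular.find(x)
    # rotate the circular string so it reads wx, then splice w after x
    w = circular[k + len(x):] + circular[:k]
    u, v = linear[:i], linear[i + len(x):]
    return u + x + w + x + v            # uxwxv

lin, circ = op1('uXwwwXv', 'X')
print(lin, circ)            # uXv wwwX
print(op2(lin, circ, 'X'))  # uXwwwXv
```

Note that op2 undoes op1, mirroring the fact that the two operations are mutually inverse on these inputs.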
A subsequent model, due to Rozenberg et al. [5], builds on the work of Landweber and Kari. Repeat sequences are modelled as pointers, connecting one MDS with another. In Fig. 3, the two MDSs x and z are separated by the IES y. The pointers are represented by arrows, → being the outgoing pointer and ← being the incoming pointer.
The first operation is the simplest, and is referred to as loop, direct-repeat excision. This operation deals with the situation depicted in Fig. 3, where two MDSs in the correct (i.e., unscrambled) order are separated by an IES. The operation proceeds as follows. The strand is folded into a loop with the pointers aligned, and then staggered cuts are made (Fig. 3b and c). The pointers connecting the MDSs then join them together, while the IES self-anneals to yield two molecules, one linear and the other circular (Fig. 3d).
The second operation is known as hairpin, inverted repeat excision, and is
used in the situation containing inverted sequences (sequence y in Fig. 4a). The
molecule folds into a hairpin structure (Fig. 4b), cuts are made (Fig. 4c) and the
inverted sequence is re-inserted (Fig. 4d), yielding a single molecule.
The third and final operation in the Rozenberg et al. model is double-loop,
alternating direct repeat excision/reinsertion. This operation is applicable in
situations where molecules have an alternating direct repeat pointer pattern, as
depicted in Fig. 5.

Fig. 3. Excision
274 P. Sant, M. Amos

Fig. 4. Inversion

Fig. 5. Alternating direct repeat pattern

The molecule folds into a double loop, with one loop being aligned by one
pointer pair, and the other loop being aligned with the second pointer pair (Fig.
6a). Cuts are then made (Fig. 6b), and the sections representing u and y exchange
positions by a process of reinsertion (Fig. 6c).
The three operations presented are intramolecular (as opposed to the intermo-
lecular operations of Landweber and Kari) in that a molecule 'reacts with' (i.e.,
folds on) itself. The process by which gene assembly takes place using these op-
erations and the computational properties of the system are discussed in detail in
[6].
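As an informal illustration, the three intramolecular operations can be sketched on strings in which pointers appear as marker substrings. This Python rendering is ours and glosses over the formal pointer bookkeeping of [6]; in particular, hi uses plain string reversal as a crude stand-in for inversion:

```python
def ld(s, p):
    """Loop, direct-repeat excision: u p v p w -> linear u p w, circular v p."""
    u, rest = s.split(p, 1)
    v, w = rest.split(p, 1)
    return u + p + w, v + p

def hi(s, p, p_inv):
    """Hairpin, inverted-repeat excision/reinsertion: the segment between a
    pointer and its inverted copy is flipped (reversal as a crude stand-in)."""
    u, rest = s.split(p, 1)
    v, w = rest.split(p_inv, 1)
    return u + p + v[::-1] + w

def dlad(s, p, q):
    """Double-loop, alternating direct-repeat excision/reinsertion:
    in u1 p u2 q u3 p u4 q u5, the segments u2 and u4 exchange positions."""
    u1, rest = s.split(p, 1)
    u2, rest = rest.split(q, 1)
    u3, rest = rest.split(p, 1)
    u4, u5 = rest.split(q, 1)
    return u1 + p + u4 + q + u3 + p + u2 + q + u5

print(dlad("A1B2C1D2E", "1", "2"))   # A1D2C1B2E: segments B and D have swapped
```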
Although the model presented may appear rather abstract, it has been success-
fully applied to the assembly of real genes, including the Actin I gene of Urostyla
grandis and Engelmanniella mobilis, the gene encoding a telomere binding pro-
tein in several stichotrich species and assembly of the gene encoding DNA poly-
merase α in Sterkiella nova. Descriptions of these applications are presented in [7].

Fig. 6. Excision/inversion

3 Discussion

In this paper we have described two models for the assembly of genes in ciliates.
Although the fundamental molecular mechanisms underlying the operations within
these models are still not well understood, they do suggest possible areas of ex-
perimental enquiry. Looking further ahead, it may well be that in the future these
mechanisms may even be exploited by using ciliates as prototype cellular com-
puters.

References

1 D. M. Prescott. Invention and mystery in hypotrich DNA. Journal of Eukaryotic Mi-
crobiology, 45(6): 575-581, 1998.
2 D.M. Prescott and A.F. Greslin. Scrambled actin I gene in the micronucleus of Oxytri-
cha nova. Developmental Genetics, 13: 66-74, 1992.
3 L.F. Landweber and L. Kari. The evolution of cellular computing: Nature's solution to
a computational problem. BioSystems, 52(1-3): 3-13, 1999.
4 L. Kari and L. F. Landweber. Computational power of gene rearrangement. In Erik
Winfree and David K. Gifford, editors, Proceedings 5th DIMACS Workshop on DNA
Based Computers, held at the Massachusetts Institute of Technology, Cambridge, MA,
USA, June 14 - June 15, 1999, pages 207-216. American Mathematical Society, 1999.
5 A. Ehrenfeucht, I. Petre, D. M. Prescott, and G. Rozenberg. Universal and simple op-
erations for gene assembly in ciliates. In C. Martin-Vide and V. Mitrana, editors,
Where Mathematics, Computer Science, Linguistics and Biology Meet, pages 329-342.
Kluwer Academic Publishers, 2001.
6 A. Ehrenfeucht, I. Petre, D.M. Prescott, and G. Rozenberg. Formal systems for gene
assembly in ciliates. Theoretical Computer Science, 2002.
7 D. M. Prescott and G. Rozenberg. Encrypted genes and their assembly in ciliates. In
Martyn Amos, editor, Cellular Computing. Oxford University Press, New York.
Developing Algebraic Models of Protein
Signalling Agents

M. J. Fisher
School of Biological Sciences, University of Liverpool, L69 3BX, UK
fishermj@liv.ac.uk

G. Malcolm, R. C. Paton†
Department of Computer Science, University of Liverpool, Chadwick Building,
Peach Street, Liverpool L69 7ZF, UK

Abstract. This chapter considers a number of ways in which individual molecules
in protein signalling systems can be thought of as computational agents. We begin
with a general discussion of some of the ways proteins can be viewed from an in-
formation processing point of view. The degree of computational prowess shown
by many proteins, such as enzymes and transcription factors, is discussed, specifi-
cally in terms of a number of 'cognitive' capacities. We review some of the pro-
teins involved in signalling that make use of transfer of phosphate groups (kinases
and phosphatases) and focus attention on the yeast MAP kinase cascades. An al-
gebraic approach to modelling certain aspects of protein interactions is introduced.
We begin with a simple algebraic model which we describe in some depth, using
yeast signalling pathways as an example; we then describe techniques and tools
which promise more sophisticated models.

1 Proteins as Computational Agents

Knowledge about the subtle intricacies of molecular processes in cells and their
sub-compartments is increasing rapidly. Cells are highly structured, hierarchically
organised open systems. Contemporary models must take account of spatial het-
erogeneity, multi-functionality and individuality. For example, an enzyme can be
described as a 'smart thermodynamic machine' which satisfies a 'gluing' (functo-
rial) role in the information economy of the cell [1]. We exploit these views by
drawing comparisons between enzymes and verbs (see later and also elsewhere in
this volume). This text/dialogical metaphor also helps refine our view of proteins
as context-sensitive information processing agents [2]. Many proteins, such as en-
zymes and transcription factors, display a number of 'cognitive' capacities includ-
ing: pattern recognition, handling fuzzy data, memory capacity, multifunctionality,
signal amplification, integration and crosstalk, and context-sensitivity (see, e.g.,
[3, 4]).
In that they are able to interact with other molecules in subtle and varied ways,
we may say that many proteins display social abilities. This is clearly demon-
strated by Welch (e.g., [5, 6]). The social dimension to enzyme agency also
278 M. J Fisher et al.

presupposes that proteins have an underlying ecology, in that they interact with other
molecules including substrates, products, regulators, cytoskeleton, membranes,
water as well as local electric fields. The metaphor also facilitates a richer under-
standing of the information processing capacities of cells (e.g., [7]). This can be
characterised as an ability to act in a flexible, unscripted manner - another feature
of adaptive agents.
There are several ways of implementing computational agents in computer
software. We have investigated a number of approaches to this, all from within an
individual based modelling (IBM) perspective (see [8, 9]). This type of approach
lends itself to fine grain simulation although there are many difficulties with the
variety, reliability and availability of experimental data. Some examples of this
work include:
• Proteins as classifier systems - in particular, we examined a fuzzy genetics-
based learning classifier approach to 'designing' calmodulin-dependent protein
kinase [3].
• Proteins as logical agents - we demonstrated how 'simpler' enzymes such as
cyclin dependent kinase (Cdk) could be specified for implementation in logic
programs ([3]; see also [10]). Investigations of fuzzy logic models of aspects of
hepatic glucose metabolism were also undertaken [11].
• Proteins in SWARM-based systems - this was concerned with certain aspects
of the E. coli signal transduction system and involved a spatially explicit IBM
approach [12].
• Proteins and algebras - this work comes from programming semantics. Not
only do we argue that it is useful to view the signaling ecology as a vast parallel
distributed processing network of agents operating in heterogeneous microenvi-
ronments, we seek to develop mathematical and semantic methodologies that
might help clarify this analogy between biological and computational systems.
Some examples are discussed in a later section.
A related subject to proteins as algebras concerns work on proteins as verbs.
This work grew from a simple analogy, that cells in some ways share systemic
properties that are like 'texts' [7]. Within this framework, enzymes (and particu-
larly signalling enzymes) play an integrative or 'gluing' or functorial role in the
cell [2]. Within this analogical framework it is possible to view multifunctional
processing capabilities of many proteins in terms of process or verb. For example,
CaM Kinase II is a large multimeric enzyme that acts on upwards of 50 substrates
and has four functional domains:
• Catalytic
• Regulatory (it has both inhibitory and CaM binding regions)
• Variable (for targeting and localisation)
• Association (with other subunits)
These functions can be expressed in terms of verbs (e.g., targeting, catalysing,
regulating etc), or as nouns. Fig. 1 summarises the interactions of the four CaM
Kinase processes in a diagram in which processes are modelled as nodes (vertices)
and interactions as arcs (edges). Using the language of Category Theory it is
Developing Algebraic Models of Protein Signalling Agents 279

possible to describe the patterns of interactions between these various processes in


terms of a colimit [13]. The pattern is a collection of cooperating objects. For ex-
ample, the internal organisation of a protein can be modelled by a pattern of do-
mains in which links represent chemical relations. A colimit (cohesive binding)
glues a pattern into a single unity in which the degrees of freedom of the parts are
constrained by the whole. The colimit models the integration of the pattern into a
single unity.

Fig. 1. Generation of the colimit of the CaM Kinase pattern

The patterns of activity of transcription factors provide further insights into
colimit or 'gluing' relations. For example, CBP/p300 are large multifunctional pro-
teins that participate in various basic cellular functions, including DNA repair, cell
growth, differentiation and apoptosis. They are important networking proteins in
that they act as focal points for multiple protein-protein interactions and co-
activate many other transcription factors including CREB, nuclear receptors, sig-
nal transducer and activator of transcription (STAT) proteins, p53, and the basal
transcription proteins. It is possible to talk about CBP/p300 acting like 'glue' in a
number of ways that relate to molecule-molecule bindings and interactions, en-
zymatic processes, as a physical bridge between various transcription factors, act-
ing as histone acetyltransferases (HATs) - linking transcription to chromatin re-
modelling, and mediating negative and positive crosstalk between different
signalling pathways.
This notion of 'gluing' is a very important concept for appreciating multifunc-
tional activities at a number of levels of intracellular scale. This 'glue' is not just
that the molecules have intrinsic adhesive properties, they also provide the cell
with combinatorial and cohesive properties at a functional level. Object and proc-
ess can be subsumed as one general term 'glue' with respect to the multifunctional-
ity of these proteins. What is more, it is possible to introduce the topological
thinking of local → semi-local → global into the context in which the 'glue' func-
tions.

2 Protein Information Processing Networks

Enzymes and transcription factors can be described as 'smart' materials in the
sense that they integrate chemical, thermodynamic and electrochemical signals in
a context sensitive manner. Indeed, examples like CaM Kinase II also exhibit a
memory capacity. The networks of interactions in a cell have similarities to the
multifarious interactions taking place across a wide-area distributed network of
computers [1, 2].
The information processing nature of eukaryotic intracellular signalling path-
ways illustrates these concepts well. Classically, a cell surface receptor binds an
extracellular effector (e.g. hormone, pheromone etc). Receptor occupation is
transduced into an intracellular signal by activation of an effector enzyme (e.g.
adenylate cyclase, phospholipase) responsible for synthesis of a secondary mes-
senger (e.g. cyclic AMP, diacylglycerol). The secondary messenger then promotes
activation/inactivation of protein kinases and/or protein phosphatases. Subsequent
changes in the phosphorylation state of target phosphoproteins (e.g. enzymes,
structural elements, transcription factors etc.) bring about the changes in cellular
activity observed in response to the external signal (see Fig. 2).
Intracellular signalling pathways all share important information processing
features; for example, the generation of a secondary messenger and occurrence of
cascades, or hierarchies, of protein kinases permits a considerable degree of am-
plification to be introduced. The classical model for intracellular signalling there-
fore presents a highly sensitive mechanism for relaying small changes in the ex-
ternal environment to the interior of the cell. The system is highly flexible and
easily adapted to respond to a diverse range of primary messenger/receptor inter-
actions. A key feature of many signalling pathways, not initially apparent in this
model, is the ability to deliver receptor-derived information to unique sub-sets of
target phosphoproteins. For example, although many different hormones and ef-
fectors use a common cyclic AMP-based pathway to activate the cyclic AMP-
dependent protein kinase (PK-A), the consequences of PK-A activation can be
very different. Specificity is built into the signalling pathway through the activa-
tion of spatially distinct subsets of PK-A molecules. Spatial organisation of PK-A
is therefore a major regulatory mechanism for ensuring the selectivity and speci-
ficity of cyclic AMP-mediated responses.
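The degree of amplification a kinase cascade affords is easy to see with a toy calculation; the per-stage turnover numbers below are purely illustrative, not measured values:

```python
def amplification(per_stage_gain, stages):
    """Total number of downstream molecules affected when each of `stages`
    catalytic steps activates `per_stage_gain` copies of the next component."""
    total = 1
    for _ in range(stages):
        total *= per_stage_gain
    return total

# One occupied receptor driving three catalytic stages, each with a
# (hypothetical) turnover of 100, reaches a million target proteins:
print(amplification(100, 3))   # 1000000
```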

Fig. 2. The classical secondary messenger signalling system

As indicated above, it is possible to consider the protein components of signal-
ling pathways to be 'smart thermodynamic machines' [4]. For example, the major
signal-sensitive protein kinases (PK-A, PK-C and calmodulin-dependent Kinase II
[CaMK)) are obviously all catalysts of phosphorylation. Additionally, they are all
'switches' that can be activated by the appropriate secondary messenger (cyclic
AMP/PK-A; diacylglycerol/PK-C; Ca2+/CaMK). Specific isoforms of these en-
zymes may also be subject to autophosphorylation. Phosphorylation of the RII iso-
form of the PK-A regulatory subunit prolongs the dissociated, activated state of
PK-A [14]. Similarly, CaMK carries out autophosphorylation of an inhibitory do-
main, thereby prolonging the activated state of the enzyme [15]. As a conse-
quence, protein kinases can be considered to have a capacity for memory, i.e. even
when secondary messenger signals have diminished, their phosphorylating power
is preserved. Protein kinases and protein phosphatases may also possess 'posi-
tional' or 'targeting' information [16]. For example, isoforms of the PK-A cata-
lytic subunit can be modified by the addition of a myristoyl group. This fatty acid-
derived group may direct PK-A to specific membrane locations, or alternatively,
into specific protein-protein interactions. The spatial organisation of PK-A also

depends upon its association with structural elements of the cell via anchor pro-
teins (AKAPs - A kinase anchor proteins; [17]). Specific isoforms of CaMK also
possess 'positional' information; in that 'nuclear-specific localisation' sequences
target this enzyme to the cell nucleus and, consequently, to a role in the phosphoryla-
tion of proteins involved in the control of gene expression [18].
Perhaps the most sophisticated example of spatial organisation of signalling
pathway components concerns the mitogen-activated protein kinase cascades
(MAPK cascades). The spatial organisation of these protein kinase cascades leads
to distinct cellular responses to a diverse group of environmental stimuli [19]. MAPK
cascades are organised into discrete parallel signalling complexes by 'scaffold
proteins' - the signalling cascade components are brought into close physical con-
tact allowing rapid and direct transfer of signalling information (see Fig. 3). An in-
triguing feature of these signalling pathways is that, despite sharing common
components, they are normally extremely well insulated from each other and show
little if any 'cross-talk or cross-activation' [20].
Fig. 3. Parallel MAP kinase cascades in yeast (inputs: mating pheromone and solute change; outputs: 'mating', 'osmoadaptation' and 'filamentation' genes)

3 Towards an Algebraic Model of Protein Interactions

In this section we sketch an algebraic approach to modelling protein interactions.
We begin with a simple algebraic model which we describe in some depth, using
yeast signalling pathways as an example; we then describe techniques and tools
which promise more sophisticated models.

As a first and somewhat naïve approach to modelling protein interactions alge-
braically, we represent an individual protein as a term such as Ste11(n), where
Ste11 is the name of the protein, in this case the MAPKKK protein Ste11, and n
represents the number of phosphate groups attached to the protein. For example,
Ste11(0) represents an unphosphorylated Ste11 protein, and Ste7(1) a phosphory-
lated Ste7 protein with one phosphate group. We are not concerned here with how
many phosphate groups may be attached to a protein, nor with where these groups
may be attached; for our present purposes it is sufficient to distinguish between
phosphorylated and unphosphorylated proteins.
In order to allow proteins to interact, we need to model the space that proteins
inhabit. A first and fairly unstructured approach to this space is to represent it as a
term of the form

P1 ⊕ P2 ⊕ ... ⊕ Pn

where each subterm Pj represents a protein. The operator ⊕ is associative and


commutative, which can be thought of as allowing proteins to 'swim past' one an-
other in an 'associative-commutative soup' (more formally, we work with bags of
proteins, or the free Abelian monoid over the terms Pj). For example, a space with
two Ste11 proteins and one Fus3 protein, all of them unphosphorylated, is repre-
sented by:

Ste11(0) ⊕ Ste11(0) ⊕ Fus3(0)


which, because ⊕ is associative and commutative, is equal to:

Ste11(0) ⊕ Fus3(0) ⊕ Ste11(0)


In general, a term such as that above is equal to all of its permutations, and the
⊕ symbol is simply used to indicate that some number of proteins inhabit the same
space, and can therefore interact with one another. We refer to such terms as pro-
tein soup terms.
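This permutation-equality can be mirrored directly in code. In a minimal sketch of our own, Python's `Counter` plays the role of the bags (multisets) mentioned above, so two soups that differ only in the order of their ⊕-operands compare equal:

```python
from collections import Counter

# A protein term Name(n) becomes a (name, phosphate_count) pair;
# a protein soup term becomes a multiset (bag) of such pairs.
soup1 = Counter([("Ste11", 0), ("Ste11", 0), ("Fus3", 0)])
soup2 = Counter([("Ste11", 0), ("Fus3", 0), ("Ste11", 0)])

# Associativity and commutativity of ⊕ amount to equality of
# multisets, regardless of the order in which proteins are listed:
assert soup1 == soup2
print(soup1[("Ste11", 0)])   # 2 copies of unphosphorylated Ste11
```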
The form of protein interaction we are concerned with here is phosphorylation,
particularly in the yeast signalling pathways discussed above. We model protein
interactions by rewrite rules that transform protein soup terms. For example, the
phosphorylation of an Ste7 protein by a phosphorylated Ste11 protein can be rep-
resented by the following rewrite rule:

Ste11(n) ⊕ Ste7(m) → Ste11(n) ⊕ Ste7(m+1)


The left-hand side of this rule represents a phosphorylated Ste11 protein (we
assume that n is greater than 0) together with an unphosphorylated Ste7 protein.
The right-hand side represents the phosphorylated Ste7 protein, which has gained
one phosphate group. The phosphate group is provided by ATP, which we assume
is in plentiful supply, so that we do not need to explicitly add ATP molecules to
our protein soup terms (though it would be a simple matter to do so).

Note that both proteins coexist in both the left- and right-hand sides of this rule;
all that changes in this transformation is the phosphorylation of the proteins. As a
concrete example of applying this rule, given the protein soup term

Fus3(1) ⊕ Ste11(2) ⊕ Ste7(0) ⊕ Ste12(0)


we can apply this rule to obtain:

Fus3(1) ⊕ Ste11(2) ⊕ Ste7(1) ⊕ Ste12(0)


(Here, we take m=0 and n=2, i.e., the phosphorylated Ste11 protein has two
phosphate groups, and the Ste7 protein goes from an unphosphorylated state to a
state in which it has one phosphate group.) Because ⊕ is associative and commu-
tative, the rule is more generally applicable in any protein soup term in which
there is both a phosphorylated Ste11 protein and an unphosphorylated Ste7 pro-
tein; for example,

Fus3(1) ⊕ Ste7(0) ⊕ Ste7(1) ⊕ Ste12(0) ⊕ Ste11(2)


can be rewritten as:

Fus3(1) ⊕ Ste7(1) ⊕ Ste7(1) ⊕ Ste12(0) ⊕ Ste11(2)


We model phosphatase action through rewrite rules of the following form:

Ste11(n) → Ste11(n−1)

Again, we assume that n is greater than 0. We assume we have similar phos-
phatase rules for each of the proteins we are interested in.
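A rewrite rule of this kind can be applied mechanically to a multiset representation of the soup. The following Python function is our own illustration (with a deterministic choice of which matching protein fires, where the formal model is nondeterministic); a phosphatase rule would be analogous:

```python
from collections import Counter

def phosphorylate(soup, kinase, target):
    """Apply kinase(n) ⊕ target(m) -> kinase(n) ⊕ target(m+1), n > 0,
    to the first matching target in the soup; return the soup unchanged
    if no phosphorylated kinase (or no target) is present."""
    if not any(name == kinase and n > 0 for (name, n) in soup):
        return soup
    for (name, m) in list(soup):
        if name == target:
            out = soup.copy()
            out[(name, m)] -= 1
            if out[(name, m)] == 0:
                del out[(name, m)]      # keep the bag tidy
            out[(name, m + 1)] += 1
            return out
    return soup

soup = Counter([("Fus3", 1), ("Ste11", 2), ("Ste7", 0), ("Ste12", 0)])
after = phosphorylate(soup, "Ste11", "Ste7")   # Ste7(0) becomes Ste7(1)
```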
The various interactions relevant to the yeast mating pathway of Fig. 3 are
given by the three rewrite rules below. The first two represent the cascade of acti-
vation facilitated by the Ste5 'scaffold' protein, where Ste11 activates Ste7, which
in turn activates Fus3. First, the activation of Ste7 by Ste11, in the presence of an
Ste5 protein:

Ste5(p) ⊕ Ste11(n) ⊕ Ste7(m) → Ste5(p) ⊕ Ste11(n) ⊕ Ste7(m+1)


Second, the activation of Fus3 by Ste7, again in the presence of the Ste5 pro-
tein:
Ste5(p) ⊕ Ste7(n) ⊕ Fus3(m) → Ste5(p) ⊕ Ste7(n) ⊕ Fus3(m+1)
Third, phosphorylation of Ste12 by Fus3, which does not require the mediation
of Ste5:

Fus3(n) ⊕ Ste12(m) → Fus3(n) ⊕ Ste12(m+1)



We have broken down the effects of the scaffold protein into two stages (the
first two rules above) for the purpose of illustration only. There is some evidence
that the effect of the Ste5 scaffold protein is processive: that is, both phosphoryla-
tion reactions occur at the same time [21].
As an example, once the mating pathway is activated, we may see the following
happen:

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(0) ⊕ Ste11(1) ⊕ Fus3(0) ⊕ Ste12(0) ⊕ Ste12(0)

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(1) ⊕ Ste11(1) ⊕ Fus3(0) ⊕ Ste12(0) ⊕ Ste12(0)

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(1) ⊕ Ste11(1) ⊕ Fus3(1) ⊕ Ste12(0) ⊕ Ste12(0)

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(1) ⊕ Ste11(1) ⊕ Fus3(1) ⊕ Ste12(1) ⊕ Ste12(0)

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(1) ⊕ Ste11(1) ⊕ Fus3(1) ⊕ Ste12(1) ⊕ Ste12(1)

Ste5(1) ⊕ Ste11(1) ⊕ Ste7(1) ⊕ Ste11(0) ⊕ Fus3(1) ⊕ Ste12(1) ⊕ Ste12(1)
At each step we highlight in bold-face the proteins that match the left-hand side
of a rewrite rule. The end result is that both the Ste12 proteins are phosphorylated,
while in the last rewrite one of the Ste11 proteins loses a phosphate group through
phosphatase action.
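The whole derivation above can be reproduced by iterating the three rules until none applies. The self-contained sketch below uses our own multiset representation and omits phosphatase action, so that the run terminates deterministically in the fully phosphorylated state:

```python
from collections import Counter

# (kinase, target, required scaffold) for the three mating-pathway rules.
RULES = [("Ste11", "Ste7", "Ste5"),
         ("Ste7", "Fus3", "Ste5"),
         ("Fus3", "Ste12", None)]

def step(soup):
    """Fire one applicable rule on an unphosphorylated target, if any."""
    for kinase, target, scaffold in RULES:
        if scaffold is not None and not any(n == scaffold for (n, _) in soup):
            continue                      # rule needs the scaffold present
        if not any(n == kinase and p > 0 for (n, p) in soup):
            continue                      # rule needs an active kinase
        if soup[(target, 0)] > 0:
            soup[(target, 0)] -= 1
            if soup[(target, 0)] == 0:
                del soup[(target, 0)]
            soup[(target, 1)] += 1
            return True
    return False

soup = Counter([("Ste5", 1), ("Ste11", 1), ("Ste7", 0), ("Ste11", 1),
                ("Fus3", 0), ("Ste12", 0), ("Ste12", 0)])
while step(soup):
    pass
# Ste7, Fus3 and both copies of Ste12 all end up phosphorylated.
```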
This algebraic model has the virtues that it faithfully represents the signalling
pathway cascade of the yeast mating response, and that it does so computationally:
the rewrite rules described above constitute a program in an algebraic program-
ming or specification language such as OBJ [22] or Maude [23] (from which we
borrow the term 'associative-commutative soup'). This computational element al-
lows us to construct simulations of protein interactions as OBJ or Maude pro-
grams. A disadvantage of our model, however, is its simplicity: it oversimplifies
the protein interactions involved in the signalling pathways in several ways. One
of these ways involves the effects of the scaffold proteins. According to the re-
write rules given above, the interactions between Ste11, Ste7 and Fus3 take place
in any space which simply contains an Ste5 protein; there is no notion of the pro-
teins binding in any way to the 'scaffold'.
We could characterise the interactions between proteins and scaffold proteins
by introducing terms such as:

Ste5(p: Ste11(n), Ste7(m))



which is intended to represent an Ste5 protein which has bound an Ste11 and an
Ste7 protein. The rewrite rule capturing the phosphorylation of Ste7 by Ste11 then
becomes:

Ste5(p: Ste11(n), Ste7(m)) → Ste5(p: Ste11(n), Ste7(m+1))


Note that we would also require rules describing how Ste11 and Ste7 proteins
become bound and unbound, as well as another form of term representing the
binding of Fus3 proteins to the scaffold. This would require considerably more
work in the way of designing appropriate term structures and rewrite rules, but
adds nothing essentially new to the simple approach described above.
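One way to prototype such scaffold-bound terms is as nested tuples, with the rewrite acting only inside the complex. Again, this is a hypothetical rendering of ours, not a worked-out term structure:

```python
# A bound complex Ste5(p: Ste11(n), Ste7(m)) rendered as a nested tuple.
def phosphorylate_on_scaffold(term):
    """Ste5(p: Ste11(n), Ste7(m)) -> Ste5(p: Ste11(n), Ste7(m+1)), for n > 0."""
    scaffold, p, ((k, n), (t, m)) = term
    if scaffold == "Ste5" and k == "Ste11" and n > 0 and t == "Ste7":
        return (scaffold, p, ((k, n), (t, m + 1)))
    return term                     # rule does not apply; term unchanged

complex_term = ("Ste5", 1, (("Ste11", 2), ("Ste7", 0)))
print(phosphorylate_on_scaffold(complex_term))
# ('Ste5', 1, (('Ste11', 2), ('Ste7', 1)))
```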
The new terms introduced to represent the binding of proteins to scaffold pro-
teins, however, are another way in which our model oversimplifies the interactions
of the pathway proteins: it does not take account of the spatial organisation of pro-
teins, neither at the level of interactions between proteins nor at the level of the spa-
tial structure of individual proteins in terms of their domains and phosphorylation
sites (an investigation of the roles of domains and sites is reported in [24]). One
possible approach that would take account of spatial organisation would be to use
graph rewriting as a generalisation of term-rewriting [25], whereby the interac-
tions between proteins and scaffold could be represented as graph structures rather
than terms. Yet again, another possibility, one that we propose rather more specu-
latively, is to consider protein-scaffold interactions (and other complexes) as 'free
algebraic structures' formed from underlying individual proteins. The algebraic
methodology for such an approach would lie in the Categorical Systems Theory
developed by [26] and applied to software systems in [27]. This approach is topo-
logical in the sense that the behaviour of systems is captured by limit construc-
tions that allow global behaviour to emerge from local behaviours. An application
of these ideas to the semantics of object systems is presented in [28]. As to the
specification of the 'free algebraic structure' of interacting proteins, a promising
approach is the nested sketches of [29], based on the very abstract category theo-
retic notion of Kan extensions.

References

1 Paton, R.C. & Matsuno, K. (1998), Some Common Themes for Enzymes and Verbs,
Acta Biotheoretica, 46, 131-140.
2 Paton, R.C. (1997), Glue, Verb and Text Metaphors in Biology, Acta Biotheoretica,
45, 1-15.
3 Paton, R.C., Staniford, G. & Kendall, G. (1996), Specifying Logical Agents in Cellu-
lar Hierarchies, in Cuthbertson, R., Holcombe, M. & Paton, R. (eds), Computation in
Cellular and Molecular Biological Systems, World Scientific: Singapore, 105-119.
4 Fisher, M.J., Paton, R.C. & Matsuno, K. (1999), Intracellular Signalling Proteins as
'Smart' Agents in Parallel Distributed Processes, BioSystems, 50, 159-171.
5 Welch, G. R. (1987), "The Living Cell as an Ecosystem: hierarchical analogy and
symmetry", Trends Ecol. Evol., 2, 305-309.

6 Welch, G. R. & Keleti, T. (1987), "Is cell metabolism controlled by a molecular de-
mocracy or a supramolecular socialism?", TIBS, 12, 216-217.
7 Paton, R.C. (1993), Some Computational Models at the Cellular Level, BioSystems,
29, 63-75.
8 Devine, P. & Paton, R.C. (1997a), Biologically-inspired Computational Ecologies: a
Case Study, AISB Evolutionary Computation Workshop, appears in Corne, D. &
Shapiro, J. (eds) LNCS Springer, Berlin Heidelberg New York.
9 Devine, P. & Paton, R.C. (1997b), Individual Based Modelling in an Explicitly Spa-
tio-temporal Ecosystem, Proceedings of IMACS 97 World Congress, Berlin.
10 Staniford, G. & Paton, R.C. (1994) Simulating Animal Societies using Distributed
Agents, Proceedings of ECAI Workshop on Distributed A.I., also appears in
Wooldridge, M.J. & Jennings, N.R. (eds) Intelligent Agents, Lecture Notes in AI 890,
Springer: Berlin, 145-159.
11 Butler, M.H. (1999), Information Processing in Liver Glucose Metabolism, Unpub-
lished PhD Thesis, The University of Liverpool.
12 Clark, L. (2000), A Distributed Information Processing Model of Bacterial Chemo-
taxis, Unpublished PhD Thesis, The University of Liverpool.
13 Ehresmann, A.C. & Vanbremeersch, J.-P. (1987), Hierarchical evolutive systems ...,
Bull. Math. Biol. 49 [1]: 13-50.
14 Takio, K., Smith, S. B., Krebs, E. G., Walsh, K. A. & Titani, K. (1984) Amino acid
sequence of the regulatory subunit of bovine type II adenosine cyclic 3',5'-phosphate
dependent protein kinase. Biochem. 23, 4200-4206.
15 Schulman, H. & Lou, L. L. (1989) Multifunctional Ca2+/calmodulin-dependent pro-
tein kinase: domain structure and regulation. Trends Biochem. Sci. 14, 62-66.
16 Hubbard, M. & Cohen, P. (1993) On target with a mechanism for reversible phos-
phorylation. Trends Biochem. Sci. 18, 172-177.
17 Rubin, C. S. (1994) A kinase anchor proteins and the intracellular targeting of signals
carried by cAMP. Biochim. Biophys. Acta 1224, 467-479.
18 Heist, E. K. & Schulman, H. (1998) The role of Ca2+/calmodulin-dependent protein
kinases within the nucleus. Cell Calcium 23, 103-114.
19 Sprague, G. F. (1998) Control of MAP Kinase signalling specificity or how not to go
HOG wild. Genes & Dev. 12, 2817-2820.
20 Fisher, M., Malcolm, G. & Paton, R.C. (2000), Spatio-logical Processes in Intracellular
Signalling, BioSystems, 55, 83-92.
21 Levchenko, A., J. Bruck and P. W. Sternberg. Scaffold proteins may biphasically affect
the levels of mitogen-activated protein kinase signaling and reduce its threshold prop-
erties. PNAS vol. 97, no. 11, pp. 5818-5823, May 2000.
22 Goguen, J. A., T. Winkler, J. Meseguer, K. Futatsugi and J.-P. Jouannaud. Introducing
OBJ3. In Joseph A. Goguen and Grant Malcolm, eds., Software Engineering with
OBJ: Algebraic Specification in Action. Kluwer, Dordrecht, 2000.
23 Clavel, M., S. Eker, P. Lincoln and J. Meseguer. Principles of Maude. Electronic
Notes in Theoretical Computer Science, vol. 4, pp. 65-89. Elsevier, Amsterdam, 1996.
24 Elion, E. E., Pheromone response, mating and cell biology. Current Opinion in Mi-
crobiology, vol. 3, pp. 573-581. Elsevier Science, Amsterdam, 2000.
25 Courcelle, B., Graph rewriting: an algebraic and logic approach. In Jan van Leeuwen,
ed., Formal Models and Semantics, vol. B of Handbook of Theoretical Computer Sci-
ence, Elsevier, Amsterdam, 1994.
26 Goguen, J., Objects. International Journal of General Systems, vol. 1, pp. 237-243,
1975.

27 Malcolm, G., Interconnections of Object Systems. In S. Goldsack and S. Kent, eds.,


Formal Methods and Object Technology. Springer Workshops in Computing,
Springer-Verlag, Berlin Heidelberg New York, 1996.
28 Malcolm, G. and Cirstea, C., Distributed operational semantics for the object para-
digm. In Working Papers of the International Workshop on Information Systems -
Correctness and Reusability, 1995.
29 Reichel, H., Nested sketches. Technical report ECS-LFCS-98-401, University of Ed-
inburgh, 1998.
Categorical Language and Hierarchical Models
for Cell Systems

R.Brown, R.Paton†, and T.Porter

R.Brown, Mathematics Division, School of Informatics, University of Wales,


Bangor, Gwynedd LL57 1UT, UK.
r.brown@bangor.ac.uk

R.Paton†, Department of Computer Science, The University of Liverpool,


Liverpool L69 3BX, UK.

T.Porter, Mathematics Division, School of Informatics, University of Wales,


Bangor, Gwynedd LL57 1UT, UK.

Abstract

The aim is to explain and explore some of the current ideas from category
theory that enable various mathematical descriptions of hierarchical struc-
tures. We review some aspects of the history and motivations behind the
development of category theory and how it has impacted on developments
in theoretical biology and theoretical computer science. This leads on to a
discussion of hierarchical systems and a discussion of some simple examples.
The important idea of colimit is then introduced. Towards the end of the
chapter a number of open questions and problems are discussed.

1 Introduction

This chapter seeks to explore some of the current ideas that provide abstract
models for hierarchical systems in general and cell systems in particular. The
models are based on category theory and as that theory is to a large extent
relatively unknown to workers in (mathematical and computational) biology,
the chapter will introduce some of the elementary language and concepts
of that theory. We will attempt to do this through fairly 'common or gar-
den' mathematical situations which themselves have aspects of hierarchical
structures about them. Our aim is to present enough of the ideas to make
it possible for the enterprising reader to start delving further into the way
in which others (Rosen [1], Ehresmann and Vanbremersch, see for instance,
[2]) have seen category theory as a potentially useful language and toolkit of
concepts for use in this area.

Discrete or network models for complex systems are commonplace in the
literature. Such models can be enriched using simple concepts from category
theory. Essentially the additional feature is to consider not just links between
'nodes' but also sequences of 'composable' links. This adds just enough alge-
braic structure into the combinatorial network model to allow a whole range
of useful new ideas to become available. Certain features can be investigated
using a 'toy model' which does not assume anything more than basic arith-
metic, yet shows up some of the hidden assumptions in more complex models.
Of course a 'toy model' cannot reveal deep structure; it is little more than a
thought experiment, so our aims are deliberately limited.

2 Category Theory: History and Motivation

The first paper in category theory was by Eilenberg and Mac Lane in 1945
[3]. It aimed to describe (i) interaction and comparison within a given con-
text (topological spaces, groups, other algebraic structures, etc.) and (ii) in-
teractions between different contexts; for instance, within the area of pure
mathematics known as algebraic topology, problems in the theory of spaces
are attacked by assigning various types of algebraic gadgetry to spaces, thus
translating the topological problem to a, hopefully more tractable, algebraic
one. A category consists of objects and 'morphisms' between them; mor-
phisms can be thought of as 'structure preserving mappings between objects'.
Even as early as the 1950s and 1960s, category theory had become a highly
successful language providing general tools for revealing common structure
between different contexts. New concepts (limits, colimits, products, coprod-
ucts, etc.) were defined abstractly and these definitions highlighted the prop-
erties that had been there in the examples but had often been hidden by
context specific details. Already in 1958 Rosen had tried using elementary
categorical language to help with the modelling of biological systems. In the
1960s, Lawvere's thesis and related work by Bénabou and others at about the
same time shed new light on what it meant for a structure to be 'algebraic'.
This linked up with the semantics of formal languages and thus with logic.
Lambek and Lawvere (1968-1970s) showed that there was an interpretation
of the typed λ-calculus, which was a rich model for some aspects of the logical
theory of sets, within category theory. Here formulae are interpreted as the
objects, and proofs as the morphisms. (For us one important point is that
the morphisms are no longer structure preserving 'mappings', they are just
'morphisms'!) This gave:
(a) Models for set theory, but something much richer is true: the 'sets' are
'variable sets'; and they can be applied in many more contexts and have their

own internal categorical logic, and
(b) Links to the newly emerging discipline of theoretical computer science.
(For instance, the semantics of programming languages has been greatly en-
riched by the work of Lambek and Scott on Cartesian Closed Categories [4],
Scott, Plotkin and Smyth on domain theory, using categorical and topological
insights, and Arbib and Manes [5] using partially additive categories.)
More recently Girard (1987) has introduced linear logic. This is a 'resource
sensitive logic' or 'logic of actions' and is, once again, firmly based on a
categorical semantics. The correspondence here is more nearly 'formulas' with
'states', and 'proofs' with 'transitions'. Within control engineering for some
time a network model (Petri nets) had been used as an analytic tool for
developing concurrent control systems. In 1991, Meseguer and Montanari
linked Petri nets, and thus concurrency, with models of linear logic. This
linked 'states' with 'objects' and 'transitions' with 'morphisms'. This work
of Meseguer led on to work on a rewriting language, MAUDE, providing a
semantics of object systems and so a 'logical theory of concurrent objects'
[6]. (The links of this work with the modelling of biological systems are briefly
explored in [7].)
For us these developments present the seductive suggestion that by adapt-
ing and extending some hypothetical model of (part of) a cell system one
might get not only a model but also some corresponding logical framework,
possibly even a rich programming language based on that logical language.
That is a long way in the future. Suffice it to say that Petri nets are also
widely used in modelling manufacturing systems, and they exist in fuzzy,
stochastic and timed variants, but as yet no categorical description exists for
these rich variants. Can one hope for a Petri net / linear logic like descrip-
tion of some of the metabolic systems in the body? This framework might
yield only 'toy models' but these could still be useful as pointing to a richer
language of processes and processing likely to be required for modelling bio-
logical systems.
With these hints of what might be possible in our wildest dreams, let us
summarise some of the characteristics of a formal language in a categorical
context. (i) It will be many sorted with a collection of states/objects. (ii) Be-
tween the states there will be transitions or morphisms that may be thought
of as 'proofs'. The constructions within the logic will give us new constructions
of objects from collections of old ones, so we might have a category with some
extra structure - but what structure?

3 Categorical Models for Hierarchical Systems

Later we will be looking at some aspects of the theory of hierarchical systems
as developed by Ehresmann and Vanbremersch (1985 to the present), but
to start with we should examine the motivation for such a theory. The idea
is that a categorical formal language which is rich enough to describe and
allow analysis of aspects of complex hierarchical systems should be applicable
to cell systems and to other situations such as manufacturing systems. We
start with a categorical model of a hierarchical system and analyse its logical
structure both for its own sake and from the modelling aspect. An ongoing
research issue that emerges from this is whether such models are sufficiently
robust when applied to experimental situations.
Our aim is to convey ideas and intuitions rather than give full details of
the mathematics. So we begin our introduction to category theoretic thinking
with a simple illustrative example of a hierarchical system that assumes very
little mathematical background on the part of the reader.
The divisors of 45 form, in the first instance, a set

{1,3,5,9,15,45}

which we write as Div(45). Of course there are also relationships between
these divisors: i is related to j if i exactly divides j. This leads to a diagram
of the structure:

Level

        45            3
       /  \
     15    9          2
    /  \  /
   5    3             1
    \  /
     1                0
This is an example of a partially ordered set (poset). The above diagram is
called the Hasse diagram of the poset. Note that only the essential 'gener-
ating' arrows are shown here. A poset is not simply a bag containing parts:
there is also an ordering among the parts. If we include the 'composite'
arrows, we get a more complicated diagram.
Categorical Language and Hierarchical Modcls for Cel! Systems 293

[Diagram: the Hasse diagram of Div(45) as before, levels 0 to 3, with the
composite arrows added (e.g. 1 → 9, 1 → 15, 1 → 45, 3 → 45, 5 → 45)]
together with a loop at each vertex i since i exactly divides itself! Each level
of the 'hierarchy' measures in some sense the 'complexity' of the objects at
that level.

In going from the simple diagram of the partially ordered set (poset) to that
including the composite arrows we are starting the transition from posets to
categories. As we do so, further structure is added to the system. What is a
category in this sense?
A category consists of:
- A collection of objects C = {i, j, k, ..., etc.}.
- Collections of arrows or links or morphisms (alternative terms depending
  on taste, context etc.).
  Each arrow has a source and target and C(i,j) will denote the set of arrows
  from source i to target j; if f ∈ C(i,j) we might also write f : i → j.
- Composition of arrows, when this makes sense:

      C(i,j) × C(j,k) → C(i,k)

  so arrows f : i → j and g : j → k compose to give fg : i → k.
- Associativity: (fg)h = f(gh) if either side makes sense.
- Identities: for each object i, there is a special 'identity arrow' 1_i ∈ C(i,i)
  such that if f : i → j, then 1_i f = f = f 1_j.
Examples
These will be based on the set Div(45) of divisors of 45. In each case we
will give the objects and then for objects i, j we will give the set of arrows
from i to j.

1. Do. Objects - divisors of 45.

       Do(i,j) = { single element   if i divides j
                 { empty            otherwise

   Composition: only one choice really.
   Identities: i always divides itself.
   In fact any poset (X, ≤) gives a category with objects the elements of X,
   and with

       X(i,j) = { one element   if i ≤ j
                { empty         otherwise

   It is sometimes useful to label the single element in such an X(i,j) where
   i ≤ j by the pair (i,j) so that the composition then gives the simple
   formula

       (i,j).(j,k) = (i,k).

   Categories generalise partially ordered sets by allowing multiple arrows
   between objects.
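The toy category Do lends itself to a few lines of code. The following Python sketch (ours, not part of the chapter; the helper names are invented for illustration) represents an arrow as the pair (i, j) and checks the category axioms on Div(45):

```python
# Div(45) as the poset category Do: one arrow (i, j) whenever i divides j.
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def hom(i, j):
    # The hom-set Do(i, j): a single arrow if i divides j, empty otherwise.
    return [(i, j)] if j % i == 0 else []

def compose(f, g):
    # Composition (i, j).(j, k) = (i, k), defined when the middle objects match.
    (i, j1), (j2, k) = f, g
    assert j1 == j2, "arrows are not composable"
    return (i, k)

objects = divisors(45)
assert objects == [1, 3, 5, 9, 15, 45]
assert hom(3, 45) == [(3, 45)] and hom(5, 9) == []
assert compose((1, 3), (3, 15)) == (1, 15)
# Identity arrows are the loops (i, i):
assert all(compose((i, i), f) == f for i in objects for f in hom(i, 45))
```

Associativity holds automatically here because composition simply keeps the two outer objects of a composable pair.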

2. D1. Objects - divisors of 45.
   D1(i,j) = set of paths from i to j in the diagram of Div(45).
   Thus, for instance,

       D1(1,45) = { 1 → 5 → 15 → 45,
                    1 → 3 → 15 → 45,
                    1 → 3 → 9 → 45 }

   Note that all paths (following the arrows) are used and are considered to be
   distinct. This category D1 is called the free category on the Hasse diagram
   of Div(45). That diagram is a directed graph and given any directed graph,
   Γ, define

       C := FCat(Γ)

   by
   Objects(C) = V(Γ), the set of nodes or vertices of Γ;
   C(u,v) = the set of directed paths in Γ from vertex u to vertex v.
   Composition: concatenation of paths.
   Identity: 'empty path' at a vertex.
   As the notation suggests, this category is called the free category on Γ.
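The free category D1 can likewise be sketched in code (again our illustration, not the authors'): an adjacency list for the Hasse diagram of Div(45), with arrows enumerated as directed paths and composition given by concatenation.

```python
# Free category on the Hasse diagram of Div(45):
# the covering relations are the 'generating' arrows.
edges = {1: [3, 5], 3: [9, 15], 5: [15], 9: [45], 15: [45], 45: []}

def paths(u, v):
    # All directed paths from u to v; [u] alone is the 'empty path' when u == v.
    if u == v:
        return [[u]]
    return [[u] + rest for w in edges[u] for rest in paths(w, v)]

def concat(p, q):
    # Composition in the free category: concatenation of composable paths.
    assert p[-1] == q[0], "paths are not composable"
    return p + q[1:]

# D1(1, 45) consists of exactly the three paths listed in the text:
assert sorted(paths(1, 45)) == [[1, 3, 9, 45], [1, 3, 15, 45], [1, 5, 15, 45]]
assert concat([1, 3], [3, 9, 45]) == [1, 3, 9, 45]
```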
3. Our next example D2 will be more complicated than a category.
   In a category C, each C(i,j) is a set; in an enriched category, it will have
   more structure - in our example, each D2(i,j) will itself be a category
   (in fact a poset):

As before, the objects of D2 will be the divisors of 45 and D2(i,j) will be
empty if i does not divide j; however if i does divide j then D2(i,j) will
be the poset Div(j/i) of divisors of the quotient of j by i. For example
D2(3,45) = Div(15) and so its Hasse diagram looks like

       15
      /  \
     3    5
      \  /
       1

If i, j and k are in D2, there is an obvious mapping

    D2(i,j) × D2(j,k) → D2(i,k)

    (a, b) ↦ ab

given by the product of numbers - this mapping preserves the order, and
so is called an 'enriched composition'. Thus D2 is an example of an 'order
enriched' category. (Uses of order enriched categories relevant to the sub-
ject matter of this chapter include the book by Arbib and Manes already
mentioned [5] and Goguen's book [8] in which a special form there called
a ~-category is used to provide structure for sign systems within algebraic
semiotics.)
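A quick computational check of this enriched structure (a sketch of ours, under the reading above): each hom-'set' D2(i, j) is the poset Div(j/i), and multiplication of numbers lands exactly where the enriched composition says it should.

```python
# D2(i, j) is the poset Div(j/i) when i divides j, and empty otherwise.
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def D2(i, j):
    return divisors(j // i) if j % i == 0 else []

assert D2(3, 45) == divisors(15)      # = [1, 3, 5, 15], as in the text
assert D2(5, 9) == []                 # 5 does not divide 9

# Enriched composition D2(i, j) x D2(j, k) -> D2(i, k) is multiplication:
for a in D2(3, 15):
    for b in D2(15, 45):
        assert a * b in D2(3, 45)
```

Order preservation is visible here too: if a divides a' in D2(i, j), then ab divides a'b in D2(i, k).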

4. Some important further examples are called 'Big' categories because there
   are so many objects (technically, the objects do not form a set).
   The generic form of these is:
   All mathematical objects with some specified structure
   All morphisms, i.e. mappings that preserve that structure
   Here are some examples:

       Name     Objects                   Morphisms/arrows
       Sets     Sets                      Functions
       Posets   Partially ordered sets    Monotone functions
       Spaces   Topological spaces        Continuous functions
       Groups   Groups                    Homomorphisms
       Cat      Small categories          Functors

5. Given any category C we can form another category C^op, called the opposite
   or dual category of C, which has the same objects as C but the arrows are
   those of C 'reversed'. Thus

       C^op(i,j) = C(j,i)

   with composition suitably adjusted. For posets, this corresponds to putting
   the reverse order on the poset and so to turning the Hasse diagram upside
   down.
Because of this any construction applied to categories comes in two
flavours, since the construction can also be applied to the opposite cat-
egory; these two are usually distinguished by giving one of them the prefix
'co'. Thus we will have below limits and colimits, and for different appli-
cations we favour one rather than the other. We will be chiefly concerned
with colimits, which are useful for describing how a structure is made from
components by putting them together: the device for describing 'putting
them together' is called a 'cocone'.

Colimits
As an illustrative example of this we give the least common multiple
lcm(a,b) of two numbers a, b. Thus lcm(a,b) is the least of the common
multiples of a, b and so lcm(9,15) = 45, lcm(3,5) = 15, and so on.
We do need a bit more precision, of course. In the natural numbers, 1, 2,
..., we say c is a common multiple of a and b if c is a multiple of a (so
there is some d with c = da), and c is a multiple of b (so there is some e
with c = eb). We note that in the diagram for Div(n) for any such c, a, b, we
would have

        c
       / \
      a   b

(with arrows a → c and b → c). This diagram is a cocone (by convention,
not a cone) on the pair a, b. It says c is a common multiple of a and b.
Now bring 'least' into play:

          c
          ↑
      lcm(a,b)
       /    \
      a      b

This type of situation can be abstracted and generalised to give the notion
of 'colimit'. The 'input data' for a colimit is a diagram D, that is a collection
of some objects in a category C and some morphisms between them. So a
diagram D could look like this:

[Diagram: D, a handful of objects (•) with a few arrows between them]

Now we need the notion of a cocone with base D and vertex an object C.
This will look like:

[Diagram: a 'tent' with one arrow from each object of D to the vertex C]

We also have to make the condition that each of the triangular faces of
this cocone is commutative, where a triangle of morphisms f : i → j,
g : j → k, h : i → k
is commutative means fg = h. In the case where the category is a poset this
condition is automatic.
The output will be an object colim(D) in our category C defined by a
special colimit cocone such that any cocone on D factors through the col-
imit cocone. The commutativity condition on the cocone in essence forces
interaction in the colimit of different parts of the diagram D.
In the next picture the colimit is written • = colimit(D) and the dotted
arrows represent new morphisms which combine to make the colimit cocone:
[Diagram: the diagram D with dotted arrows from each of its objects to a
new vertex • = colimit(D), forming the colimit cocone]

Again, all triangular faces of the combined picture are commutative. Now
stripping away the 'old' cocone gives the factorisation of the cocone via the
colimit:

[Diagram: the cocone on D with vertex C, factored as the colimit cocone
followed by the induced morphism colimit(D) → C]
Intuitions:
The object colim(D) is 'put together' from the constituent diagram D by
means of the colimit cocone. From beyond (or above in our diagrams) D, an
object C 'sees' the diagram D 'mediated' through its colimit, i.e. if C tries to
interact with the whole of D, it has to do so via colim(D). The colimit cocone
is a kind of program: given any cocone on D with vertex C, the output will
be a morphism colim(D) → C.
Example
The lcm is the colimit of the diagram

      a         b
       \       /
       gcd(a,b)

(with arrows gcd(a,b) → a and gcd(a,b) → b). The gcd, from a lower level
of the hierarchy, 'measures' the interaction of a and b.
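Numerically, this colimit and its universal property are easy to check. The following Python sketch (ours, using the standard library's gcd) is one way to see it:

```python
from math import gcd

def lcm(a, b):
    # lcm as the colimit of the diagram  a <- gcd(a, b) -> b  in Div(n).
    return a * b // gcd(a, b)

assert lcm(9, 15) == 45 and lcm(3, 5) == 15     # the examples in the text

# Universal property: every cocone vertex c (a common multiple of a and b)
# receives a morphism from the colimit, i.e. lcm(a, b) divides c.
a, b = 9, 15
for c in range(1, 1000):
    if c % a == 0 and c % b == 0:               # c is a cocone on the pair a, b
        assert c % lcm(a, b) == 0               # ...so it factors through lcm
```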
Some people have viewed models of biological organs as colimits of the
diagrams of interacting cells within them.
WARNING. Often colimits do not exist in a category C for some diagrams.
However, one can add colimits in a completion process, i.e. freely for a class
of diagrams, and then compare these 'virtual colimits' with any that happen
to exist. An example of this process would seem to be the introduction in
neural net theory in the 1980s of the notion of 'virtual neurons' or 'assem-
blages' where an interacting subnet of 'neurons' exhibited behaviour as if it
was a (more powerful) single neuron. Perhaps the superstates considered by
Bell and Holcombe [9] are similarly 'formal' colimits. Instances of biological
situations that lead to diagrams of this sort and hence to colimits occur in
the work of Ehresmann and Vanbremersch, mentioned below in more detail,
in Dioguardi [10] where they are used to model 'the hepatone', which is a
model of the interaction of the major cell types in the liver, and, generically,
in the discussion of 'glue' in the study of integrative biology by the second
author ([11, 12]). Dioguardi's formulation of the 'hepatone' and subsequent
extension to incorporate also the 'hepatonexon' is an important illustration of
the use of mathematical thinking to clarify the idea of biological functioning
units. It is especially interesting as a number of models have been proposed
for units of hepatic function.
It is important to note that a colimit with its defining cocone has more
structure than merely the sum of its individual parts, since it depends on
the arrows of the diagram D as well as the objects. Thus the specification
for a colimit object of the cocone which defines it can be thought of as a
'subdivision' of the colimit object. It would be interesting if the folding of a
one-dimensional protein sequence to a three-dimensional functioning struc-
ture could be seen as a colimit operation.
Much of this has been leading up to an introduction to the notion of a
(categorical model for a) hierarchical system as formulated by Ehresmann
and Vanbremersch [2].
They consider a basic category ℍ with specified objects and arrows, and a
partition of Obj(ℍ), the set of objects of ℍ, into p + 1 classes (levels) labelled
0, 1, ..., p such that each object A' at level n + 1 (with n < p) is the colimit
in ℍ of a diagram A of linked objects at level n.
We refer the reader to the papers of Ehresmann and Vanbremersch and
their web page
http://perso.wanadoo.fr/vbm-ehr/
and also to related pages such as that of Amiguet
http://iiun.unine.ch/people/mamiguet/index.html
Queries
(i) Why here is A only made up of objects and links at level n? In 'cell
systems', does one need shared objects of lower levels within the diagram?
(ii) How can one handle mathematically, and then computationally, the
properties of A', given knowledge of A?
Parting Thoughts
(a) To model manufacturing control systems, models such as Petri nets,
timed event graphs, etc. exist in numerous flavours, stochastic, fuzzy, etc.
These seem 'enriched versions'. Is there a way of handling hierarchical systems
in which these 'enrichments' play a significant role? Some small progress has
been made in this direction - but so far it is inconclusive. For a model of
computation using enriched categories see [13].
(b) E-V hierarchical systems try to model cell systems and do consider
weighted arrows. Would a variant of their theory, but using (poset or lattice)
enriched categories, enable an amalgam of their rich conceptual basis with the
rich computational machinery already developed from (a) above?
(c) Are there 'formal language' aspects of the hierarchical systems, capable
of providing models for cell systems?
There is a good practical interpretation of linear logic for manufacturing
systems [14]. There are many levels of manufacturing process in cellular sys-
tems - from the synthetic processes producing intracellular post-translational
products to material targeted for export (notably protein products). One also
needs to model the vertical as well as the horizontal information processing
involved in such multiple levels of cellular interaction.
(d) What might be a feasible successful biological model in the above
context? Recall that fractals have been considered successful because they
showed that complex variation could result from a very simple model. How-
ever, many fractals are very simple, since they are defined by iterated function
systems, based on iterates of a single map. Examples need to be developed
of the next level of complexity, where also some actual computation, rather
than experimentation, is feasible because of the algebraic conditions imposed
by the structure of the system. Thus a research programme would be to com-
bine the algebra of rewriting [15], which considers the consequences of rules,
with some continuous variation as in fractals, to see how a range of 'colimit
structures' can develop. A generalisation of rewriting to categories and to
actions of categories is given in [16].
(e) We should also note the work of Dampney and Johnson on information
systems [17], which showed that simple commutative diagram considerations
could have useful consequences in simplifying a complex system (and so sav-
ing money). Since efficiency is of importance for biological systems, we would
hope to find examples of analogous considerations.
(f) Another potential area for development is that of 'higher dimensional
algebra'; see the introduction given in [18]. This shows that one of the con-
tributions of category theory is not only to give a useful general language
for describing structures but also, in a self-reference mode, that in order to
describe the array of structures which have arisen in mathematics new math-
ematical structures have been found to be needed, and that these structures
have proved of independent interest. Not only that, a crucial step in developing
category theory is to have an algebraic operation, composition of arrows,
which is defined under a geometric condition, that the source of one arrow
is the target of the other. So one is led to envisage more general kinds of
compositions. An overall slogan in one aspect of the applications of these
more general structures was:
Algebraic inverses to subdivision.
That is, we may know how to cut things up, subdivide them, but do we have
an adequate algebra which encodes the structure and rules which govern the
behaviour of the result of putting them together again? It was found, as is
described with references to the literature in [18], that there are forms of
what are called multiple categories which do have convenient properties in
some situations in this regard. These ideas have led to new mathematics
which has enabled new descriptions and new computations not available by
other means. The enriched categories to which we referred earlier can also be
regarded as forms of multiple categories.
The situation is even more elegant in that we generally think that compo-
sition is described mathematically by forms of algebra. There is a growing
body of mathematics called 'co-algebra' (see for example [19]) which seems
to give a possible language for subdivision. The combination of these two
strands of composition and subdivision could well be important for broader
applications in the future.
Another theme related to 'algebraic inverses to subdivision' is 'non-
commutative methods for local-to-global problems'. See [18] for an example
of how a two-dimensional structure proposed in 1932 for geometric purposes,
and in which the operations were always defined, was found to reduce to
a commutative one. It is well known that one aspect of the foundation of
quantum mechanics was the introduction of non-commutative operations:
doing an observation A and then an observation B will not necessarily give
the same result as in the other order: in symbols, we may have AB ≠ BA.
Higher dimensional algebra gives almost an embarras de richesse of such
non-commutative structures, through the use of operations which are defined
only under geometric conditions. It is still early days, but intuition suggests
that we require a rich form of mathematics, and one in which algebra is partly
controlled by geometry, for new descriptions of the richness of complication
of life forms.
(g) Finally, we mention that the idea of structures evolving over time
can be incorporated in categorical models by considering categories varying
over time, so that the colimits evolve within the categories. Further, forms
of multiple categories have generalised notions of colimits, and so of ways
of building a 'structure' out of parts. Again, we can consider adding a time
parameter on such a multiple category, so that it and its internal structures
are evolving with time.

4 Conclusion

We hope that pointing out the existence of this categorical mathematics will
help the formulation of applications and also suggest ways to new forms
of mathematics required for the biological applications. Category theoretic
applications to biological systems such as those of Rosen, Ehresmann and
Vanbremersch, and the chapters in this volume by Malcolm and Fisher, by
Paton, and by Wolkenhauer, help to strengthen the importance of relational as
well as hierarchical thinking in biology.

References

1. R. Rosen, Life Itself, Columbia University Press, New York (1991).
2. A. C. Ehresmann and J.-P. Vanbremersch, Hierarchical Evolutive Systems:
A mathematical model for complex systems, Bull. of Math. Biol. 49 (1)
(1987) 13-50. (For a full list of their work see: http://perso.wanadoo.fr/vbm-
ehr/Ang/Publi2T.htm)
3. S. Eilenberg and S. Mac Lane, The general theory of natural equivalences,
Trans. Amer. Math. Soc. 58 (1945) 231-294.
4. J. Lambek and P. J. Scott, Introduction to Higher Order Categorical Logic,
Cambridge University Press, Cambridge, UK, 1986.

5. M. A. Arbib and E. G. Manes, Algebraic approaches to program semantics,
Springer-Verlag, Berlin Heidelberg New York (1986).
6. J. Meseguer, A logical theory of concurrent objects and its relation to the
MAUDE language, in: Agha, G., Wegner, P., Yonezawa, A. (Eds.), Research
Directions in Concurrent Object-Oriented Programming, MIT Press, Cambridge,
Mass., 314-390. (See also http://maude.csl.sri.com/papers/)
7. M. J. Fisher, G. Malcolm and R. C. Paton, Spatio-logical processes in intracel-
lular signalling, Biosystems 55 (2000) 83-92.
8. J. Goguen, An introduction to algebraic semiotics with application to user in-
terface design, in: Computation for Metaphor, Analogy and Agents, C. Nehaniv,
Ed., Springer Lecture Notes in Artificial Intelligence 1562 (1999) 242-291.
9. A. Bell and M. Holcombe, Computational models of cellular processing, in:
Computation in Cellular and Molecular Biological Systems, R. Cuthbertson,
M. Holcombe and R. Paton, eds., Singapore: World Scientific (1996).
10. N. Dioguardi, Fegato a Più Dimensioni, Etas Libri, RCS Medecina, Milan.
11. R. C. Paton, Glue, verb and text metaphors in biology, Acta Biotheoretica 45
(1997) 1-15.
12. R. C. Paton, Process, structure and context in relation to integrative biology,
Biosystems 64 (2002) 63-72.
13. F. Gadducci and U. Montanari, Enriched Categories as Models of Computation,
in: Alfredo De Santis, Ed., Fifth Italian Conference on Theoretical Computer
Science, pp. 20-42, World Scientific, Singapore, 1995.
14. F. Girault, Formalisation en logique linéaire du fonctionnement des réseaux de
Petri, Thèse, LAAS, Université Paul Sabatier, Toulouse, Dec. 1997.
15. F. Baader and T. Nipkow, Term Rewriting and All That, Cambridge University
Press, Cambridge, UK, 1998.
16. R. Brown and A. Heyworth, Using rewriting systems to compute left Kan ex-
tensions and induced actions of categories, J. Symbolic Computation 29 (2000)
5-31.
17. C. N. G. Dampney and M. Johnson, On the value of commutative diagrams in
information modelling, Springer Workshops in Computing, eds. Nivat et al.,
1994, 47-60, Springer, London.
18. R. Brown and T. Porter, The intuitions of higher dimensional algebra for the
study of structured space, Seminar at the series of G. Longo 'Géométrie et
Cognition', École Normale Supérieure, May 2001.
19. S. Krstić, J. Launchbury and D. Pavlović, Categories of Processes Enriched in
Final Coalgebras, Springer LNCS 2030, 331ff.
Mathematical Systems Biology:
Genomic Cybernetics

O. Wolkenhauer
Systems Biology & Bioinformatics Group, Department of Computer Science,
University of Rostock, www.sbi.uni-rostock.de
o.wolkenhauer@umist.ac.uk

W. Kolch
Institute of Biomedical and Life Science, University of Glasgow,
Cancer Research UK, Beatson Laboratories, Garscube Estate,
Switchback Road, Glasgow G61 1BD, UK

K.-H. Cho
School of Electrical Engineering, University of Ulsan, Ulsan, 680-749,
South Korea

Abstract. The purpose of mathematical systems biology is to investigate gene ex-
pression and regulation through mathematical modelling and systems theory in
particular. The principal idea is to treat gene expression and regulatory mecha-
nisms of the cell cycle, morphological development, cell differentiation and signal
transduction as controlled dynamic systems.
Although it is common knowledge that cellular systems are dynamic and regu-
lated processes, to this date they are not investigated and represented as such. The
kinds of experimental techniques which have been available in molecular biology
largely determined the material reductionism, which describes gene expression by
means of molecular characterisation.
Instead of trying to identify genes as causal agents for some function, role, or
change in phenotype, we ought to relate these observations to sequences of events.
In other words, in systems biology, instead of looking for a gene that is the reason,
explanation or cause of some phenomenon, we seek an explanation in the dynam-
ics (sequences of events ordered by time) that led to it.
In mathematical systems biology we are aiming at developing a systems theory
for the dynamics of a cell. In this text we first define the concept of complexity in
the context of gene expression and regulation before we discuss the challenges and
problems in developing mathematical models of cellular dynamics, and provide an
example to illustrate systems biology and the challenges and perspectives of this
emerging area of research.

1 Introduction: Action versus Interactions

Gene expression is the process by which information stored in the DNA is trans-
formed via RNA into proteins. While the availability of genome sequences is
without doubt a revolutionary development in the life sciences, providing a basis
for technologies such as microarrays, the principal aim of the post-genome era is
to understand the organisation (structure) and dynamics (behaviour) of genetic
pathways. The area of genomics reflects this shift of focus from molecular charac-
terisation of components to an understanding of the functional activity of genes,
proteins and metabolites. This shift of focus in genomics requires a change in the
way we formally investigate cellular processes: here we suggest a dynamic sys-
tems approach to gene expression and regulation, an approach we refer to as sys-
tems biology or genomic cybernetics.
Later we are going to provide an example of intracellular dynamics by means of a mathematical model for a signalling pathway. However, looking at cells interacting in the morphological development of an organism provides another example of the importance of a dynamic-systems perspective of gene expression and
regulation. In the differentiation of cells during development we find that, to relate the genome of a cell to the reactions which occur in the cell, we require a conceptual framework for both spatial and temporal aspects, one that captures the relationship between an internal programme and the dynamic interactions between the cell and its environment. The environment may be other cells, physical constraints or external signals to which the cellular system can respond. While we suppose that the cells in a developing organism possess the same genome, they nevertheless can develop and respond completely differently from one another. To answer why and how this can happen one ought to study gene expression as a temporal process. The principal challenge for systems biology is then to answer the following questions [adopted from 1]:
1. How do cells act and interact within the context of the organism to generate
coherent wholes?
2. How do genes act and interact within the context of the cell so as to bring about structure and function?
Asking how genetic pathways are dynamically regulated and spatially organised, we distinguish between the action and interaction of genes and cells respectively (intra- and intercellular dynamics). For example, considering morphological
development, to what extent do genes control the process or do genes only partici-
pate in a reactive fashion? Many decisions in development are induction events
mediated by the contact with the surroundings. The multicellular context therefore
determines what happens to the individual cell. For example, cancer cells have lost
this ability to respond and therefore disregard tissue organisation and grow unre-
strictedly and invasively. It seems that cells and eventually organs have an inher-
ent developmental programme which they execute unless instructed otherwise.
Since the 1960s it has been known that the most basic cellular processes are dynamic and feedback controlled, and that cells display anticipatory behaviour. In the 1960s, investigating regulatory proteins and the interactions of allosteric enzymes, François Jacob and Jacques Monod introduced the distinction between 'structural genes'
(coding for proteins) and 'regulatory genes', which control the rate at which struc-
tural genes are transcribed. This control of the rate of protein synthesis gave the
first indication of such processes being most appropriately viewed as dynamic sys-
tems. With the lack of experimental time-course data, mathematical models of
gene regulatory networks have so far focused on ordinary or stochastic differential
equations and automata [2, 3]. For such models to be specific they can only consider a small number of genes, while for simulations of many interacting genes the relation to experimental data is lost. This problem, also known as Zadeh's uncertainty principle, is further discussed below. It is clearly important to explore the principal
limits of how we can balance the composition of components on a large scale, pre-
serving the integrity of the whole system, with the individuality of its components,
and without losing too much accuracy on the small scale. Since the two organisational levels (gene versus genome or cell versus tissue/colony) are very different
with regard to how we can observe and represent them, different areas of research
have evolved around these organisational and descriptional levels. For example,
while differential equations have been used to develop accurate or predictive
models of individual genes in a particular organism and context [2], Boolean net-
works modeling hundreds and thousands of interacting genes have been successful
in capturing evolutionary aspects at the genome level [4].
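Boolean-network models of the kind cited in [4] can be sketched in a few lines. The following minimal, hypothetical Kauffman-style network (Python; the gene count, connectivity and random seed are arbitrary choices for illustration) iterates a random network until its trajectory settles on an attractor, the discrete analogue of a stable expression pattern.

```python
import random

def random_boolean_network(n_genes=8, k=2, seed=1):
    """Build a random Boolean network in the spirit of Kauffman's models:
    each gene reads k randomly chosen genes through a randomly drawn
    Boolean rule (a lookup table over the 2**k input patterns)."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_genes), k) for _ in range(n_genes)]
    rules = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n_genes)]
    return inputs, rules

def step(state, inputs, rules):
    """Synchronous update: every gene reads its inputs' current values."""
    new_state = []
    for ins, rule in zip(inputs, rules):
        idx = 0
        for i in ins:
            idx = (idx << 1) | state[i]
        new_state.append(rule[idx])
    return tuple(new_state)

def attractor_length(state, inputs, rules):
    """Iterate until a state repeats; the finite state space guarantees
    every trajectory ends on an attractor (fixed point or cycle)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, rules)
        t += 1
    return t - seen[state]

inputs, rules = random_boolean_network()
cycle = attractor_length((0,) * 8, inputs, rules)
```

Because the state space of eight binary genes has only 2^8 states, every trajectory must revisit a state and hence fall onto an attractor, which is what makes such coarse models tractable for hundreds or thousands of genes.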
The challenge is to develop a conceptual framework, which integrates these
models through abstraction (i.e., generalisation). For even the simplest of biological systems we find that a whole range of techniques, from time series analysis (regression models) and dynamic systems theory (rate equations, behavioural models) to automata theory (finite state machines) and various others, is likely to be considered. The validation and evaluation of any mathematical model against experimental data will further require pattern recognition techniques such as multivariate clustering and component analysis. There is therefore a great need for the integration of mathematical models and for formalising the modelling process itself.
Possible approaches which may be able to integrate or unify these distinct meth-
odologies are briefly discussed in the following section.

2 Integrating Organisational and Descriptional Levels of Explanation

Depending on what biological problem is investigated, a number of quite distinct mathematical concepts are used to represent the system under consideration.
While it is often possible to take alternative perspectives on the same problem,
there are situations in which a certain conceptual framework is more 'natural' and
has been established as the most appropriate representation. An important question
for mathematical modelling in the post-genome era is therefore to compare and
contrast different organisational and descriptional levels and to identify the most
appropriate mathematical framework. Some interesting questions arising from this
are:

• Why are there alternative formal representations?
• What are the limitations of formal representations, and how do these depend on the available experimental data as well as the descriptional and organisational level of the system under consideration?
• How can we relate and combine different mathematical models?
An investigation into the questions above would generate a 'wish-list' of
mathematical research that is required to address the challenges provided by post-
genome life science research. While the question of how to integrate mathematical
models is relatively new, the need to integrate various software tools has long
been recognised in the area of bioinformatics. Over the last few years a number of software tools have been developed to describe various aspects of gene expression and regulation. Depending on which organisational or descriptional level of the biological problem is addressed, these tools are usually not alternatives but complement each other. It is therefore generally recognised that there is no all-in-one package providing a solution but that instead a common interface is necessary. The 'Systems Biology Workbench' and 'Systems Biology Markup Language' [5] are the result of such considerations. The present text suggests a complementary effort at the theoretical (mathematical) level.
In modelling gene expression and regulation we are particularly interested in representing intra- and intercellular dynamics by combining two modelling paradigms: components (cells or the expression of particular genes) are represented by continuous dynamics, i.e. rate equations (differential or difference equations) based on the well-known enzyme kinetics in biochemistry, while multi-cellular dynamics are modelled using discrete representations such as finite state machines (discrete event modelling) [6-8].
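As an illustration of such a hybrid scheme, the sketch below (pure Python; every rate constant, threshold and state name is an assumption invented for the example, not taken from the text) couples a continuous rate equation for one gene product to a two-state machine, yielding relaxation-style switching between expression modes.

```python
# Hypothetical hybrid model combining both paradigms: a gene product x
# follows a rate equation (continuous dynamics, Euler-integrated), while
# a two-state finite state machine switches the production rate at
# illustrative thresholds. All constants are made up for the sketch.

def simulate(t_end=80.0, dt=0.01):
    x = 0.0                               # concentration of the gene product
    state = "OFF"                         # discrete state of the state machine
    production = {"OFF": 0.0, "ON": 1.0}  # state-dependent synthesis rate
    degradation = 0.1                     # first-order decay rate
    trace = []
    t = 0.0
    while t < t_end:
        # continuous part: dx/dt = k_prod(state) - k_deg * x
        x += dt * (production[state] - degradation * x)
        # discrete part: threshold-triggered transitions (with hysteresis)
        if state == "OFF" and x < 1.0:
            state = "ON"
        elif state == "ON" and x > 8.0:
            state = "OFF"
        trace.append((t, x, state))
        t += dt
    return trace

trace = simulate()
```

With these illustrative numbers the concentration grows towards the 'ON' steady state, the machine switches off near the upper threshold, the product decays, and the cycle repeats, so both discrete states appear in the trajectory.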
For a formal representation, one possible conceptual framework which could possibly unify these different mathematical models is closely related to Rosen's metabolic-repair or (M,R)-systems [9, 10]. Rosen uses category theory to discuss limitations of reductionism and modelling in the Newtonian realm. Another important application of category theory to biological systems is the memory evolutive systems (MES) of Ehresmann and Vanbremeersch [http://perso.wanadoo.fr/vbm-ehr/AnintroT.htm]. Ehresmann and Vanbremeersch have developed a mathematical model for open, self-organised, hierarchical autonomous systems with memory and the ability to adapt to various conditions through a change of behaviour. We shall here adapt Rosen's (M,R)-systems as transformation-regulation or (T,R)-systems to reflect the more general application to gene expression and regulation. The formal representation of gene expression and regulation therefore addresses two aspects: transformation and regulation. The concept of regulation is either represented explicitly by control components, or realised implicitly as an emergent phenomenon (e.g. self-organisation).
The first step in this approach is to introduce two mathematical spaces (domain and co-domain) representing either abstract or material objects. For example, we may want to relate genes with functions, substrates with products or, as in the context of time course experiments, sequences of events. In any case, a component or system is subsequently represented by a mapping between the associated spaces. This mapping represents some transformation, which itself is
regulated through further maps from the previously introduced co-domain and the
set of mappings between the two spaces. While Rosen captured this transforma-
tion-regulation process using category theory, it is possible to derive conventional
models such as automata, state-space representations and regression models from
them [11]. In [12] we discussed how automata and state-space models can be con-
sidered as special cases (or 'realisations') of (T,R)-systems. The shift of focus
from molecular characterisation to understanding the dynamics of pathways in ge-
nomics is reflected in the change of the definition of the objects in the domain and
co-domain to become sequences of data obtained from time course experiments.
Further below we return to the discussion about how the change of thinking in ge-
nomics should be reflected in mathematical modelling of biological systems.
Constraints on the nature of mappings, and therefore the class or categories of functions and their structure, arise 'naturally' from biological considerations. For instance, gene products usually have more than one biological function, which frequently depends on the state of the cell (metabolic, other signalling, etc.). To give
one extreme example, beta-catenin is a structural protein of cell-cell adhesions at
the cell membrane, where it helps in gluing cells together. However, it can also work as a transcription factor in the nucleus as the endpoint of the so-called Wnt
pathway, which is an extremely important developmental pathway. Any devia-
tions from expected behaviour have catastrophic consequences in the development
of the organism. Thus, a mapping or the class of mappings must be able to ac-
commodate dynamic changes. Sometimes two different genes may lead to the
same biological function. Gene knock-out studies show that the function of a deleted gene can sometimes be replaced by another gene or genes. For instance,
there are several Ras genes, three of which have been knocked out in mice: Harvey-Ras, Kirsten-Ras and N-Ras. The H-Ras and N-Ras knock-outs are almost normal, but the K-Ras knock-out is lethal. The work of Casti [11, 13], which extends
Rosen's work on (M,R)-systems and considers regulation in dynamic metabolic
systems, could provide an interesting starting point to investigate this problem.
Conventional systems theory considers inputs (independent variables) transformed into outputs (dependent variables). The input/output point of view, although suitable for the engineering and physical sciences, is unsuitable for cellular systems or gene networks, as these systems do not have an obvious direction of signal flow. In contrast, in the 'behavioural approach' [14] systems are viewed as defined by any relation among dynamic variables, and a mathematical model is defined as a subset of a universum of possibilities. Before we accept a mathematical model as an encoding of the natural system, all outcomes in the universe are possible. The modelling process then defines a subset of time-trajectories, taking on values in a suitable signal space, and thereby defines a dynamic system by its behaviour rather than by its inputs and outputs. While the definition of causal entailment via designated 'inputs' and 'outputs' remains the primary objective for the biological scientist, its definition follows that of a dynamic system in terms of time trajectories. Willems' behavioural framework therefore fits very well the situation in which we obtain experimental data. For example, microarrays provide us with large sets of short time series for which dependencies have to be identified from the data rather than being defined a priori.

Microarrays are one of the latest breakthroughs in experimental molecular biology and allow the monitoring of gene expression for tens of thousands of genes in
parallel and in time. For a comprehensive representation of gene expression, current microarray technology lacks resolution, and the activity of post-translational factors in regulation remains undetected by it. Many molecules that control genetic regulatory circuits act at extremely small intracellular concentrations. The resultant fluctuations in the reaction rates of a biochemical process (e.g. a signalling pathway) cause large variations in the rates of, for example, development and morphology. Most of the changes that matter must therefore be comparatively large by their very nature, at least for a short period of time, to be observable with microarrays. A problem is that one tends to look at large populations, e.g. bacterial cells in
a colony grown on a Petri dish. Even massive changes occurring in single cells
will appear small, if they do not occur synchronised within a small window of
time. Nevertheless the technology is progressing and one can expect that some of
these technical limitations will be overcome to allow system identification from
time series data [12, 15].
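Such data sets, many genes observed over a few time points, are typically explored first with the multivariate clustering mentioned earlier. As a toy illustration, the sketch below (Python; the synthetic profiles, the two response classes and the deterministic initialisation are all assumptions made for the example) groups short expression time courses with a minimal k-means.

```python
# Illustrative only: ten synthetic 'expression profiles' (short time
# series of six points, standing in for microarray measurements) are
# grouped by a minimal k-means into rising ('induced') and falling
# ('repressed') response classes.

def kmeans(profiles, k=2, iters=20):
    # deterministic, evenly spaced initialisation for reproducibility
    centroids = [profiles[i * (len(profiles) - 1) // max(k - 1, 1)]
                 for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in profiles:
            # assign each profile to the nearest centroid (squared distance)
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # recompute centroids as column-wise means of each cluster
        centroids = [[sum(col) / len(cl) for col in zip(*cl)] if cl
                     else centroids[j] for j, cl in enumerate(clusters)]
    return clusters

induced = [[0.1 * t + 0.05 * g for t in range(6)] for g in range(5)]
repressed = [[1.0 - 0.1 * t + 0.05 * g for t in range(6)] for g in range(5)]
clusters = kmeans(induced + repressed)
```

On these well-separated synthetic classes the algorithm recovers the two response groups; real microarray data would first require normalisation, replicate handling and a distance measure suited to short, noisy series.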

3 Scaling and Model Integration

On an empirical level a complex system is one that exhibits the emergence of unexpected behaviour. In other words, a (complex) system is defined as an organised
structure of interdependent components whose properties and relationships are
largely determined by their function in the whole. Here we shall adopt a notion of
complexity that reflects our ability to interact with the natural system in such ways
as to make its qualities available for scientific analysis. In this context, by 'analy-
sis' we understand the process of encoding a natural system through formal sys-
tems, i.e. mathematical modelling. The more independent encodings of a given
natural system that can be built, the more complex the system is. Complexity is
therefore not just treated as a property of some particular mathematical model; nor
is complexity entirely an objective property of the natural system. Summarising the complexity of biological systems, we identify complexity as:
• A property of an encoding (mathematical model), e.g. its dimensionality, order
or number of state-variables
• An attribute of the natural system under consideration, e.g. the number of components, descriptive and organisational levels that ensure its integrity
• Our ability to interact with the system, to observe it, i.e. to make measurements
and generate experimental data
On all three accounts genes, cells, tissues, organs, organisms and populations are, individually and as a functional whole, complex systems. At any level, the notion of complex systems and the implicit difficulties in studying them are closely related to the specific approach by which we proceed. On a philosophical level this is related to epistemological questions, while for scientific practice it relates to the choice of a particular methodology (e.g. a Bayesian approach) or model (e.g.
differential equations). We return to the choice of an appropriate mathematical model further below.
In dynamic systems theory one would initially ignore spatial aspects in the analysis of cell differentiation. This approach is of limited use because both space and time are essential to explain the physical reality of gene expression. The fact that the concepts of space and time have no material embodiment (they are not in the molecules or their DNA sequence) has been an argument against material reductionism. Although this criticism is in principle correct, alternative methods are in short supply. The problem is that, although components of cells have a specific location, these locations lack exact coordinates. Without spatial entailment there can be no living cell, but for formal modelling we would require a topological representation of this organisation. Notwithstanding the fact that, for example, for larger diffusion times we ought to consider partial differential equations in biokinetic modelling, the complexity of these models frequently forces us to compromise. It is the movement of molecules which raises most concern for the modeller; location or compartmentalisation can be dealt with by an increased number of variables covering regions.
Although the environment of a cell is always taken as one of the essential factors for cell differentiation, it will be difficult to separate external from internal signalling in the analysis of experimental data. A key problem is then how we can generalise from a model which assumes physiological homogeneity as well as a homogeneous or closed environment to a model that includes intracellular biochemical reaction dynamics, signalling, and cell-to-cell interactions.
Gene expression takes place within the context of a cell, and between cells, organs and organisms. While we wish to 'isolate' a system, conceptually 'closing' it from its environment through the definition of inputs and outputs, we inevitably lose information in this approach. (Conceptual closure amounts to the assumption of constancy for the external factors and the fact that external forces are described as a function of something inside the system.) Different levels may require different modelling strategies, and ultimately we require a common conceptual framework that integrates different models. For example, differential equations may provide the most realistic modelling paradigm for a single-gene or single-cell representation, but cell-to-cell and large-scale gene interaction networks are probably most appropriately represented by some finite state machine. In addressing the problem
of scaling and integration of models, there are two kinds of system representa-
tions:
• Intra-component representations, in which the state of a sub-system or component (e.g. cell or gene) of a system is determined by a function (e.g. linking state-variables in rate equations) and the evolution of states determines the system's behaviour
• Inter-component discrete representations of 'whole' systems (e.g. clone, tissue or genome), which do not define the state of the system explicitly; instead the state emerges from the interactions of sub-systems or components ('cells as agents')
A problem is how to combine these two very different representations. While a
clone or colony of bacteria might be described as optimising a global 'cost-
function', one could alternatively consider cells as related but essentially independent components with an internally defined programme for development, including mechanisms in response to environmental changes or inputs. The com-
parison and combination of both modelling paradigms could lead to a number of
interesting questions related to how the scientist interprets causal entailment in
biological systems.
In general, causation is a principle of explanation of change in the realm of
matter. In dynamic systems theory causation is defined as a (mathematical) rela-
tionship, not between material objects, but between changes of states within and
between components. In biology causation cannot be formally proven and a 'his-
torical approach' is the basis for reasoning, i.e., if correlations are observed con-
sistently and repeatedly over an extended period of time, under different condi-
tions and by different researchers, the relationship under consideration is
considered 'causal'. This approach is surprisingly robust, although exceptions
have been found to almost any single dogma in biology. For instance, some viruses contain RNA genomes which they copy into DNA for replication and then have the host cell transcribe it back into RNA.

4 Theory and Reality: Experimental Data and Mathematical Models

Abstract, theoretical mathematical models have, so far, played little or no role in the post-genome era of the life sciences. The use of mathematical or probabilistic
models has been mostly restricted to the justification of algorithms in sequence
analysis. Mathematical models of gene expression or gene interactions have either
been a theoretical exercise or are only concerned with the practical application of
multivariate techniques such as in the analysis of array data.
More abstract and hence general models are necessary and particularly useful in situations that capture hierarchical systems consisting of highly interconnected components. For example, consider the development of blood cells; there it seems that the primitive stem cells express a whole battery of so-called 'lineage-specific genes', i.e. genes that are normally only expressed in a subset of differentiated cells such as B-cells or T-cells. During differentiation, which again is induced from outside by hormones, growth factors and other still ill-defined cues, this 'mess' in gene expression is cleaned up and most genes are shut down. Thus, only the genes which determine the proper lineage remain on. This is rather the opposite of what one would expect. In the stem cell everything is on, and specificity in differentiation is achieved by shutting off the expression of most genes and just leaving a few selected on.
Two very fundamental aspects of life are 'transformation' (change) and 'main-
tenance' (replication, repair, regulation). Here these processes can be summarised
as 'gene expression' - the process by which information, stored in the DNA, is
transformed into products such as proteins. While in the past biologists have stud-
ied gene expression by means of 'molecular characterisation' (of material objects)
the post-genome era is characterised by a shift of focus towards an understanding
of 'functional activity'. While the study of structural properties of proteins (e.g. with the purpose of determining their function) will continue to be a research area, it is increasingly recognised that protein interactions are the basis for observations made at the metabolic and physiological level. This shift of perspective is possible
with new experimental technologies allowing for experiments that consider tem-
poral changes in gene expression. In other words, it now becomes possible to
study gene expression as a dynamic, regulated process.
The development of (Zermelo-Fraenkel) set theory in mathematics and the ma-
terial reductionism in biology have parallels in that both regard things as more
fundamental than processes or transformations. The limitations of the 'object-
centred material reductionism' in biology are generally accepted. The books by
Rosen and more recently those by Solé and Goodwin (Signs of Life) and Rothman's Lessons from the Living Cell discuss these issues. Mathematicians have
developed with category theory a more flexible language in which processes and
relationships are put on equal status with 'things'. In other words, category theory
promotes a conceptual framework in which 'things' are described not in terms of
their constituents, but by their relationships to other things. There are other phi-
losophical reasons to consider such a relational perspective of biology. In particu-
lar, the philosophical system of Arthur Schopenhauer (who essentially refined
Immanuel Kant's work) provides a basis for a relational approach following from
the fact that always and everywhere each thing/object exists merely in virtue of another thing. But for anything to be different from anything else, either space or
time has to be pre-supposed, or both. Since causation is the principle of explana-
tion of change in the realm of matter, causation is subsequently a relationship, not
between things, but between changes of states of things.
In order to verify theoretical concepts and mathematical models we ought to
identify the model from experimental data or at least validate the model with data.
The problem of complexity appears then in two disguises:
• Dimensionality: hundreds or thousands of variables/genes/cells
• Uncertainty: small samples (few time points, few replicates), imprecision,
noise
Analysing experimental data we usually rely on assumptions made about the
ensemble of samples. A statistical or 'average' perspective may, however, hide short-term effects that are the cause of a whole sequence of events in a genetic
pathway. What in statistical terms is considered an outlier may just be the phe-
nomenon the biologist is looking for. It is therefore important to compare different
methodologies and to question their implicit assumptions with the consequences
for the biological questions asked. To allow reasoning in the presence of uncer-
tainty, we have to be precise about uncertainty.
For a systems approach investigating causal entailment, it is further necessary to be able to systematically manipulate the system. At present the 'data mining' approach is the prevailing technique to study genomic data, but it is important to realise that this will only allow us to investigate associations (e.g. quantified by means of correlation analysis). Causal relationships can only be studied through a comparison of system behaviour in response to perturbations. This
not only imposes demands on the experimental design (being able to manipulate
certain variables according to specific input patterns to the system) but further
suggests that the systems biologist should be part of the experimental design proc-
ess rather than being 'delivered' a data set for analysis.

[Figure: schematic of the modelling process, linking physico-chemical principles, measurement and observation, pre-processing, linearisation, reduction, simulation, parameter estimation and realisation]
Fig. 1. Mathematical modelling of biological systems can follow two routes - 'modelling',
guided by experimental data, and 'identification' from experimental data. In both cases, we
rely on numerous assumptions and simplifications [12]

Once the experimental design is completed and data are being generated, the question arises of which kind of mathematical model, and which structure, it should have. In the theory of dynamic systems we generally have to make a decision whether to regard the process as a deterministic non-linear system with a negligible stochastic component or to assume the non-linearity to be only a small perturbation of an essentially linear stochastic process. Genuine non-linear stochastic processes have not yet been shown to be applicable for practical time-series analysis. Although natural phenomena are never truly linear, for a very large number of them linear (stochastic) modelling is often the only feasible option. The dilemma with, for example, microarray time course experiments is that hundreds of variables are sampled at only a few sample points, with replicates considered a luxury. This naturally gives rise to questions regarding the limitations of stochastic linear modelling in the context of such data.
An interesting question in the context of the semantics of mathematical models
is the role of 'noise' or random fluctuations in general. In biology, the role of random variation is often illustrated with examples related to evolution and intracellular fluctuations of regulatory molecules. For the latter the question is usually an-
swered by the number of molecules involved, fewer molecules usually suggesting
a stochastic model while large numbers of molecules often permit a deterministic
model. While in the former case variation is an intrinsic aspect of the natural sys-
tem under consideration, a noise term in a description or formal representation is
often used to 'cover up' variations that cannot be explained with the given model
and hence relates to a limitation in the observation and explanation of the phe-
nomena. The question then is whether a mathematical model is considered to explain the underlying 'mechanism' which led to the observations, or whether we require a model which numerically predicts a particular variable or set of variables.
Many biological systems appear to require a certain amount of noise to reach a state with optimal conditions (e.g. equilibrium). Random variations allow the sys-
tem to adapt to a changed environment. In the extreme, without noise a biological
system cannot react to change and a purely random system has lost its ability to
perform any regular function. This discussion leads to an argument for an optimal
'signal-to-noise' ratio and mathematical models which allow for a noise term. For
example, in time-series analysis Yule developed a conceptual framework in which
order (represented by a linear, parametric or autoregressive model) is obtained
from a sequence of independent random shocks (white noise process).
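Yule's construction can be made concrete in a few lines. The sketch below (Python standard library only; the AR(2) coefficients and seed are arbitrary illustrative choices, selected to keep the process stationary and quasi-periodic) drives an autoregressive filter with white noise and measures the serial correlation, the 'order', that the filter induces in otherwise independent shocks.

```python
import random

# Sketch of Yule's idea: passing independent random shocks (white noise)
# through an autoregressive filter yields an ordered, serially
# correlated series.

def ar2(n=500, a1=1.6, a2=-0.9, seed=42):
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)             # independent random shock
        x.append(a1 * x[-1] + a2 * x[-2] + shock)
    return x[2:]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, a simple measure of serial order."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

series = ar2()
```

For these coefficients the theoretical lag-1 autocorrelation is a1/(1 - a2) ≈ 0.84, far from the zero autocorrelation of the shocks themselves, which is the sense in which the linear filter turns noise into order.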
Noise in the form of random fluctuations arises in pathway modelling in two
ways. Internal noise is inherent in the biochemical reactions; its magnitude is inversely proportional to the system size, and its origin is usually considered to be thermal. On the other hand, external noise is a variation in one or more of the con-
trol parameters, such as the rate constants associated with a given set of reactions.
External noise then drives the system into different attractors (i.e. fixed points, limit cycles) of the dynamical systems model. If the noise level is considered
small, its effects can often be incorporated post hoc into the rate equations as an
additional term. On the other hand, if noise is the more dominant aspect, a sto-
chastic model may be a more appropriate conceptual framework to start with. Bio-
chemical processes typically only involve a small fraction of any given signalling
molecule. For instance, most receptors give a full biological response when only
10-20% of them are engaged by ligand. More ligand often even leads to an inhibi-
tion of responses. For this reason one type of signalling molecule can function in
several distinct pathways and exert completely different functions (this again
could be represented by a hybrid model). While random variations appear to be an
essential strategy for adaptation and survival, many regulatory pathways in cells
have highly predictable outcomes. This dynamic stability of genetic networks is
the result of redundancy and the interconnection of systems (loops). To faithfully
represent these phenomena using mathematical modelling, we therefore need to model individual sub-systems as well as the assembly of components into a complex network.
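The 'additional term' route mentioned above can be sketched with a simple Euler-Maruyama integration of a one-variable rate equation (Python; the production and degradation constants and the noise amplitude are all assumptions chosen for the illustration, not values from the text).

```python
import math
import random

# Sketch of incorporating noise post hoc into a rate equation:
# dx/dt = k_prod - k_deg * x, integrated with a small additive
# stochastic term (Euler-Maruyama scheme).

def simulate(k_prod=1.0, k_deg=0.5, sigma=0.05,
             x0=0.0, dt=0.01, n=20000, seed=7):
    rng = random.Random(seed)
    x = x0
    trace = [x]
    for _ in range(n):
        drift = (k_prod - k_deg * x) * dt                        # deterministic part
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # noise term
        x += drift + diffusion
        trace.append(x)
    return trace

trace = simulate()
# The trajectory fluctuates around the deterministic steady state
# k_prod / k_deg = 2, rather than settling on it exactly.
```

When the fluctuations dominate rather than perturb, a genuinely stochastic description (e.g. a chemical master equation or a stochastic simulation of individual reaction events) would be the more appropriate starting point, as the text notes.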

5 Mathematical Systems Biology: Genomic Cybernetics

Systems biology is an emerging field of research focused on the application of systems and control theory to molecular systems [10, 16]. It aims at a system-level understanding of metabolic or regulatory pathways by investigating interrelationships (organisation or structure) and interactions (dynamics or behaviour) of genes (RNA transcripts, proteins) and the genome or cells (metabolites).
The biggest problem that any approach to mathematical modelling in biology
faces is well summarised by Zadeh's uncertainty principle which states that, as the
complexity of a system increases, our ability to make precise and yet significant
statements about its behaviour diminishes until a threshold is reached beyond
316 O. Wolkenhauer et al.

which precision and significance (or relevance) become almost exclusive charac-
teristics. Overly ambitious attempts to build predictive models of cells or subcel-
lular processes are likely to experience the fate of historians and weather forecast-
ers - prediction is difficult, especially if it concerns the future ... , and these
difficulties are independent of the time, amount of data available or technological
resources (e.g. computing power) thrown at the problem.
The problem is that perturbations to cells have multi-gene / multi-transcript /
multi-protein responses; 'closing' the system, i.e. restricting the model to a small
set of variables and assuming constancy of some variables, inevitably leads to an
often unacceptable level of uncertainty in the inference. In other words, the prob-
lems of applying systems theory in biology can be summarised by
(a) The difficulty of building precise and yet general models
(b) The 'openness' of biological systems, the fact that these systems are hierar-
chical and highly interconnected

6 Dynamic Pathway Modeling as an Example

We mentioned before the need to combine continuous representations (e.g. mass
action differential equations) and process algebras (formal languages such as the
π-calculus). The example given above was motivated by combining representations
of intra- and intercellular dynamics. The problem of modelling signalling pathways,
however, is another good example in which the need for hybrid models has be-
come clear. Intracellular signalling pathways directly govern cell behaviour at cel-
lular, tissue and whole-genome level and thereby influence severe pathologies
such as cancer, chronic inflammatory disease, cardiovascular disease and neuro-
logical degeneration syndromes. Signal transduction mechanisms have been iden-
tified as important targets for disease therapy. Signalling modules regulate funda-
mental biological processes including cell proliferation, differentiation and
survival. These 'decisions' are arrived at by reaching thresholds in concentrations.
The duration of reaching a threshold matters, and while some processes are
reversible, others are not. While rate changes are best represented by differential
equations, such switching into different 'operating modes' is best represented using a
'logical formalism'. Forward and backward biochemical reactions run in parallel
and 'compete', rendering sequential representations unrealistic. Rate equations
originate as a first approximation, whereby internal fluctuations are ignored. These
deterministic differential equations describe the evolution of the mean value of
concentrations of the various elements involved. The existence of positive and
negative feedback in a regulatory network is considered common and leads to
non-linear rate equations.
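A minimal sketch of such a hybrid representation, with illustrative parameter values rather than ones from any pathway discussed here, couples a continuous rate equation for a response protein to a discrete 'operating mode' that switches when an upstream signal crosses a concentration threshold:

```python
# Hybrid model sketch: a continuous rate equation for a response protein R,
# dR/dt = production - k_deg * R, coupled to a discrete operating mode that
# is 'on' only while the signal S is at or above a threshold concentration.
# All parameter values are illustrative.

def simulate(s_level, threshold=0.5, k_syn=1.0, k_deg=0.2,
             dt=0.01, t_end=50.0):
    r = 0.0
    mode_on = False
    for _ in range(int(t_end / dt)):
        mode_on = s_level >= threshold         # logical switch between modes
        production = k_syn if mode_on else 0.0
        r += dt * (production - k_deg * r)     # Euler step of the rate equation
    return r, mode_on

r_low, on_low = simulate(s_level=0.2)    # below threshold: mode off, no response
r_high, on_high = simulate(s_level=0.8)  # above threshold: R approaches k_syn/k_deg = 5
```

The continuous part captures the graded kinetics, while the threshold rule captures the switch-like 'decision'; in a full hybrid model the mode would in turn feed back into the rate equations.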
Mathematical Systems Biology 317

Fig. 2. There is an interesting contrast and complementarity between modelling in
the engineering and physical sciences and inference in biology: measurement of a
natural system feeds mathematical modelling (axioms, equations, diagrams), which
represents simple systems quantitatively and supports inferential entailment -
precise inference but inaccurate conclusions; observation feeds empirical analysis
(natural language, diagrams, pictures), which describes complex systems
qualitatively and supports causal entailment - accurate conclusions but imprecise
reasoning
The MAPK signalling pathway dynamics are an example of a system which has
been investigated by a number of research groups with very different modelling
paradigms, including mass-action differential equations, Monte Carlo simulations,
and process algebras. To date, none of these is considered an all-round satisfac-
tory solution, providing a biologically faithful and transparent model that can be
verified experimentally. Intracellular signalling pathways carry signals from cell-
surface receptors (where the process known as signal transduction converts the
signal produced by receptor activation) to their intracellular des-
tination. The information flow is realised by biochemical processes, implemented
by networks of proteins. These networks have been represented and visualised by
Petri nets, Boolean networks and other graph-based networks. A number of simu-
lation environments such as BioSpice, DBSolve, Gepasi, StochSim, ProMOt,
Diva, Cellerator, Vcell and E-cell amongst others are available and efforts such as
the Systems Biology Workbench and Systems Biology Markup Language are suit-
able computational tools to integrate and combine various tools. Here we shall
consider a sub-module of a signalling pathway and focus on a description of its
biokinetic reactions by means of (non-linear) ordinary differential equations. The
difficulties and challenges arising when this model is to be extended to cover most
of the aspects discussed previously will become apparent from the discussion be-
low.

Fig. 3. The Ras/Raf-1/MEK/ERK signalling pathway (activated by mitogens and
growth factors)

The Ras/Raf-1/MEK/ERK module (Fig. 3) is a ubiquitously expressed signal-
ling pathway that conveys mitogenic and differentiation signals from the cell
membrane to the nucleus. This kinase cascade appears to be spatially organised in
a signalling complex nucleated by Ras proteins. The small G protein Ras is acti-
vated by many growth factor receptors and binds to the Raf-1 kinase with high af-
finity when activated. This induces the recruitment of Raf-1 from the cytosol to
the cell membrane. Activated Raf-1 then phosphorylates and activates
MAPK/ERK Kinase (MEK), a kinase that in turn phosphorylates and activates Ex-
tracellular signal Regulated Kinase (ERK), the prototypic Mitogen-Activated Pro-
tein Kinase (MAPK). Activated ERKs can translocate to the nucleus and regulate
gene expression by the phosphorylation of transcription factors. This kinase cas-
cade controls the proliferation and differentiation of different cell types. The spe-
cific biological effects are crucially dependent on the amplitude and kinetics of
ERK activity. The adjustment of these parameters involves the regulation of pro-
tein interactions within this pathway and motivates a systems biological study.
Figs. 4 and 5 describe 'circuit diagrams' of the biokinetic reactions for which a
mathematical model is used to simulate the influence RKIP has on the pathway.
The pathway is described by 'reaction modules' (Fig. 4), each of which can be
viewed as a (slightly modified) enzyme kinetic reaction for which the following
set of differential equations is obtained:

dx1(t)/dt = -k1·x1(t)·x2(t) + k2·x3(t)

dx2(t)/dt = -k1·x1(t)·x2(t) + k2·x3(t) + k3·x3(t)

dx3(t)/dt = k1·x1(t)·x2(t) - k2·x3(t) - k3·x3(t)

dx4(t)/dt = k3·x3(t) - k4·x4(t)

The entire model, as shown in Fig. 5, is composed of these modules, leading to
what usually becomes a relatively large set of differential equations for which pa-
rameter values have to be identified.
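As an illustration of how such a reaction module can be simulated numerically, the sketch below integrates the four differential equations above with a simple forward-Euler scheme; the rate constants and initial concentrations are placeholder values, not estimates from the chapter.

```python
# Forward-Euler integration of the enzyme-kinetic reaction module, with
# x1 = substrate, x2 = enzyme, x3 = complex and x4 = product.
# Rate constants and initial concentrations are placeholder values.

def simulate_module(k1, k2, k3, k4, x0, dt=0.001, t_end=50.0):
    x1, x2, x3, x4 = x0
    for _ in range(int(t_end / dt)):
        v = k1 * x1 * x2                 # association of substrate and enzyme
        dx1 = -v + k2 * x3
        dx2 = -v + k2 * x3 + k3 * x3
        dx3 = v - k2 * x3 - k3 * x3
        dx4 = k3 * x3 - k4 * x4
        x1 += dt * dx1
        x2 += dt * dx2
        x3 += dt * dx3
        x4 += dt * dx4
    return x1, x2, x3, x4

x1, x2, x3, x4 = simulate_module(k1=0.5, k2=0.1, k3=0.3, k4=0.05,
                                 x0=(1.0, 0.5, 0.0, 0.0))
# Free and bound enzyme (x2 + x3) remain at the initial total of 0.5.
```

The parameter values would of course have to be estimated from data, which is the topic taken up next.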

Fig. 4. The pathway model is constructed from basic reaction modules like this enzyme ki-
netic reaction (substrate S, enzyme E, complex ES), for which a set of four differential
equations is required

Fig. 5. Graphical representation of the ERK signalling pathway regulated by RKIP
(involving Raf-1*, RKIP, MEK-PP, ERK, RKIP-P and RP): a circle represents a state for
the concentration of a protein and a bar a kinetic parameter of reaction to be estimated.
The directed arcs (arrows) connecting a circle and a bar represent the direction of signal
flow. The bi-directional thick arrows represent an association and a dissociation rate at
the same time. The thin unidirectional arrows represent a production rate of products

As illustrated in Fig. 6, in the estimation of parameters from western blot data,
the parameter estimates usually appear as a time-dependent profile since the time
course data include various uncertain factors such as transient responses, noise
terms, etc. However, if the signal transduction system itself is inherently time-
invariant then the estimated parameter profile should converge to a certain con-
stant value at steady-state. Therefore we have to find this convergence value if the
system is time-invariant. Otherwise we have to derive an interpolated polynomial
function of time for time-varying systems. For reasons of cost, logistics and time
management, for any particular system under study, concentration profiles are
usually obtained only for a relatively small number of proteins and for few data
points. One subsequently relies on values obtained from the literature. But even if
data are available, the estimation of parameters for non-linear ordinary differential
equations is far from being a trivial problem. For the parameter estimation shown
in Fig. 6, we discretised the given continuous differential equations along with a
sample time, which usually corresponds to the time of measurement. Then the

continuous differential equations can be approximated by difference equations.


This leads to a set of algebraic difference equations that are linear in the parame-
ters, so regression techniques can be employed.
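The discretisation-plus-regression idea can be illustrated on the simplest possible case, a single first-order reaction x' = -k·x (say, a dissociation step). Discretising with sample time dt gives (x[i+1] - x[i])/dt ≈ -k·x[i], which is linear in the parameter k, so least squares recovers it; the 'true' rate and sample time below are purely illustrative.

```python
import math

# Estimating the rate constant of a first-order reaction x' = -k*x from
# sampled data by discretisation and least squares. With noise-free
# exponential data the regression recovers (1 - exp(-k*dt))/dt exactly,
# which approaches k as the sample time dt shrinks (finite-difference bias).
# k_true and dt are illustrative values.

k_true, dt = 0.53, 0.5
samples = [math.exp(-k_true * dt * i) for i in range(12)]  # noise-free time course

# (x[i+1] - x[i]) / dt = -k * x[i] is linear in k, so least squares gives:
num = sum(-(samples[i + 1] - samples[i]) / dt * samples[i]
          for i in range(len(samples) - 1))
den = sum(x * x for x in samples[:-1])
k_hat = num / den
```

With noisy data the same regression yields the time-dependent parameter profiles of Fig. 6, whose convergence to a constant indicates a time-invariant system.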

Fig. 6. Illustration of parameter estimation from time-series data: the upper left shows
the Raf-1*/RKIP complex association parameter k1 = 0.53, the upper right shows the
Raf-1*/RKIP/ERK-PP association parameter k3 = 0.625, the lower left shows the Raf-1*
and RKIP dissociation parameter k2 = 0.0072, and the lower right shows the ERK-PP and
Raf-1*/RKIP complex dissociation parameter k4 = 0.00245

If a satisfactory model is obtained, this can then be used in a variety of ways to


validate and generate hypotheses, or to help experimental design. Based on the
mathematical model illustrated in Fig. 5, and the estimated parameter values as for
example obtained using a discretisation of the nonlinear ordinary differential equa-
tions (as illustrated in Fig. 6), we can perform simulation studies to analyse the
signal transduction system with respect to its sensitivity to variations of RKIP
and ERK-PP. For this purpose, we first simulate the pathway model for varying
initial concentrations of RKIP (RKIP sensitivity analysis - see Fig. 7). Next we
perform the corresponding simulation for varying initial concentrations of
ERK-PP (ERK-PP sensitivity analysis - see Fig. 8).
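The structure of such a sensitivity analysis can be sketched as follows: the model is re-simulated for a range of initial concentrations of one species and the downstream response is recorded. Here a single enzyme-kinetic module stands in for the full RKIP pathway model, and all parameter values are purely illustrative.

```python
# Sensitivity analysis sketch: re-simulate a reaction module for a range of
# initial concentrations of one species and record the downstream response.
# A single enzyme-kinetic module (substrate x1, enzyme x2, complex x3,
# product x4, no product degradation) stands in for the pathway model;
# all values are illustrative.

def product_after(x1_0, k1=0.5, k2=0.1, k3=0.3, dt=0.001, t_end=50.0):
    x1, x2, x3, x4 = x1_0, 0.5, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        v = k1 * x1 * x2
        dx1 = -v + k2 * x3
        dx2 = -v + (k2 + k3) * x3
        dx3 = v - (k2 + k3) * x3
        dx4 = k3 * x3
        x1 += dt * dx1
        x2 += dt * dx2
        x3 += dt * dx3
        x4 += dt * dx4
    return x4

# Vary the initial substrate concentration and record the product formed:
response = {c: product_after(c) for c in (0.5, 1.0, 2.0)}
```

Plotting such response curves against the varied initial concentration gives sensitivity surfaces of the kind shown in Figs. 7 and 8.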

Fig. 7. The simulation results for varying concentrations of RKIP (suppression of Raf-1
kinase activity by RKIP): the upper left shows the change of concentration of Raf-1*, the
upper right shows ERK, the lower left shows RKIP, and the lower right shows RKIP-P

Fig. 8. The simulation results for varying concentrations of ERK-PP (continued): the
upper left shows the change of concentration of MEK-PP, the upper right shows
RKIP-P-RP, the lower left shows ERK-P-MEK-PP, and the lower right shows RP

The kind of models and the modelling approach which we introduced here have
already proven to be successful (i.e. useful to the biologist) despite the many as-
sumptions, simplifications and subsequent limitations of the model. A challenge
for systems biology remains: how can we scale these models up to describe not

only more complex pathways but also to integrate information and capture dy-
namic regulation at the transcriptomic, proteomic and metabolomic levels.
MAP kinase pathways in particular have been investigated by various groups us-
ing a variety of mathematical techniques [17, 18], and the co-existence or generali-
sation of different methodologies raises questions about the biological systems
considered: for mathematical models to have explanatory power, and hence be use-
ful to biologists, the semantics or interpretation of the models matters. Do we as-
sume that a cell is essentially a computer or machine executing logical pro-
grammes, is it a biochemical soup, an essentially random process, or is it a set of
independent agents interacting according to a set of pre-defined rules? Is noise an
inherent part of the biological process or do we introduce it as a means to repre-
sent unknown quantities and variations?
Real-world problems and challenges for applying and developing research in the area
of systems biology abound. For example, consider the development of
mathematical models used to analyse and simulate problems in development such
as what is sometimes called asymmetrical division. This describes the phenome-
non that when a stem cell divides, one daughter cell differentiates whereas the
other remains a stem cell; otherwise our stem cells would become depleted. This phe-
nomenon happens although the dividing stem cell has the same pattern of gene
expression and is exposed to exactly the same environmental cues. Another applica-
tion would be the mathematical modelling of differential gene expression and regu-
lation of transcription during a bacterial or viral infection. The theoretical work
could be guided by the analysis of DNA microarray data, which are available for a
number of organisms.

7 Summary and Conclusions

The discussion above outlined a dynamic systems framework for the study of gene
expression and regulation. We are interested in the interface between internal cel-
lular dynamics and the external environment in a multi-cellular system. A defini-
tion of complexity in the context of modelling gene expression and regulation was
given, and the background and perspective taken were described in detail. While the
motivation is to investigate some fundamental questions of morphological devel-
opment, differentiation and responses to environmental stress, the proposal is to
focus these questions on a limited set of problems, methodologies and experimen-
tal techniques. The use of a model is to see the general in the particular, i.e., the
purpose of a mathematical model of cellular processes is not to obtain a perfect fit
to experimental data, but to refine the biological question and experiment under
consideration.
The central dogma of systems biology is that the cell and its inter- and
intracellular processes constitute dynamic systems. An understanding of regulatory
systems therefore requires more than merely collecting large amounts of data by
gene expression assays. If we are to go beyond association to an understanding of
causal entailment, we need to go beyond the data mining approach. The systems
approach is characterised by systematic manipulation of the system behaviour.

Reality is described as a continuous dynamic process, best represented as a sys-
tem of components realising a spatio-temporal relationship of events. The motiva-
tion comes from the fact that, despite the endless complexity of life, it can be or-
ganised, and repeated patterns appear at different organisational and descriptional
levels. Indeed, the fact that the incomprehensible presents itself as comprehensible
has been a necessary condition for the sanity and salary of scientists. This princi-
ple is tested in systems biology with mathematical models of gene expression and
regulation for simple and yet complex biological systems.
If this document gives the impression that molecular biology, with its focus on
spatial/structural molecular characteristics, is failing to address temporal and rela-
tional aspects, then systems and control theory equally misses the importance of
spatial or structural arrangements in its representations. The problem of how to combine
both temporal and spatial aspects in one model has been a major challenge in the
engineering and physical sciences and will be an even greater one for molecular
processes, which consist of a large number of interacting components.
With the shift of focus from molecular characterisation to an understanding of
functional activity in genomics, systems biology can provide us with methodolo-
gies to study the organisation and dynamics of complex multivariable genetic
pathways. The application of systems theory to biology is not new, and Mihajlo
Mesarovic wrote in 1968 that 'in spite of the considerable interest and efforts, the
application of systems theory in biology has not quite lived up to expectations. One
of the main reasons for the existing lag is that systems theory has not been directly
concerned with some of the problems of vital importance in biology.' His advice
for the biologists was that progress could be made by more direct and stronger in-
teractions with system scientists. 'The real advance in the application of systems
theory to biology will come about only when the biologists start asking questions
which are based on the system-theoretic concepts rather than using these concepts
to represent in still another way the phenomena which are already explained in
terms of biophysical or biochemical principles. Then we will not have the "appli-
cation of engineering principles to biological problems" but rather a field of sys-
tems biology with its own identity and in its own right.'

References

1 R. Solé and B. Goodwin (2000): Signs of Life: How Complexity Pervades Bi-
ology. Basic Books, New York.
2 J.J. Tyson and M.C. Mackey (2001): Molecular, Metabolic and Genetic Con-
trol. Chaos, Vol. 11, No. 1, March 2001 (Special Issue).
3 J. Hasty, D. McMillen, F. Isaacs and J.J. Collins (2001): Computational Stud-
ies of Gene Regulatory Networks: In Numero Molecular Biology. Nature Re-
views Genetics, Vol. 2, No. 4, 268-279, April 2001.
4 S.A. Kauffman (1995): At Home in the Universe: The Search for Laws of
Self-Organisation and Complexity. Oxford University Press, New York,
1995.

5 M. Hucka, A. Finney, H. Sauro, H. Bolouri, J. Doyle and H. Kitano (2001):
The ERATO Systems Biology Workbench: An Integrated Environment for
Multiscale and Multitheoretic Simulations in Systems Biology. Chapter 6 in
Foundations of Systems Biology, H. Kitano (ed.), MIT Press, Cambridge,
Mass., 2001.
6 P.J. Ramadge and W.M. Wonham (1989): The Control of Discrete Event Sys-
tems. Proc. IEEE, Vol. 77, 81-98 (Special Issue: Discrete Event Dynamic
Systems).
7 X.-R. Cao and Y.-C. Ho (1990): Models of Discrete Event Dynamic Systems.
IEEE Control Systems Magazine, Vol. 10, No. 3, 69-76.
8 K.-H. Cho and J.-T. Lim (1999): Mixed Centralised/Decentralised Supervi-
sory Control of Discrete Event Dynamic Systems. Automatica, Vol. 35, No.
1, 121-128.
9 R. Rosen (1985): Anticipatory Systems. Pergamon, New York.
10 O. Wolkenhauer (2001a): Systems Biology: The Reincarnation of Systems
Theory Applied in Biology? Briefings in Bioinformatics, Henry Stewart Pub-
lications, Vol. 2, No. 3, 258-270, September 2001 (Special Issue: Modelling
Cell Systems).
11 J.L. Casti (1988): Linear Metabolism-Repair Systems. Int. J. General Sys-
tems, Vol. 14, 143-167.
12 O. Wolkenhauer (2001b): Mathematical Modelling in the Post-Genome Era:
Understanding Genome Expression and Regulation - A System Theoretic
Approach. BioSystems, Elsevier, Amsterdam.
13 J.L. Casti (1988b): The Theory of Metabolism-Repair Systems. Appl.
Mathematics Comput., Vol. 28, 113-154.
14 J.C. Willems (1991): Paradigms and Puzzles in the Theory of Dynamical
Systems. IEEE Transactions on Automatic Control, Vol. 36, No. 3, 259-294,
March 1991.
15 O. Alter, P.O. Brown and D. Botstein (2000): Singular Value Decomposition
for Genome-Wide Expression Data Processing and Modeling. PNAS, Vol.
97, No. 18, 10101-10106, 29 August 2000.
16 H. Kitano (ed.) (2001): Foundations of Systems Biology. MIT Press, Cam-
bridge, Mass.
17 B. Schoeberl, C. Eichler-Jonsson, E.D. Gilles and G. Müller (2002): Compu-
tational Modelling of the Dynamics of the MAP Kinase Cascade Activated by
Surface and Internalised EGF Receptors. Nature Biotechnology, Vol. 20,
April, 370-375.
18 A.R. Asthagiri and D.A. Lauffenburger (2001): A Computational Study of
Feedback Effects on Signal Dynamics in a Mitogen-Activated Protein Kinase
(MAPK) Pathway Model. Biotechnol. Prog., Vol. 17, 227-239.
What Kinds of Natural Processes can be
Regarded as Computations?

C. G. Johnson

Computing Laboratory, University of Kent at Canterbury, Canterbury,


Kent, CT2 7NF, UK
C.G.Johnson@ukc.ac.uk

Abstract. This chapter is concerned with how computational ideas can be used as
the basis for understanding biological systems, not by simulating such systems, but
by taking a computational stance towards the way such systems work. A number of
issues are addressed. The first is the question of what kinds of computer science are
needed to help understand computational processes which happen outside of con-
ventional computing machines. The second issue addressed places computational
constraints on how the world can act into Dennett's framework of grades of possi-
bility. The final main section considers the issue of changes in the world, and when
it is meaningful to regard such changes as carrying out computations.

1 Introduction

In recent years the idea of using computational concepts as a way of understanding
biological systems has become of increasing importance; this conceptual use of
computational ideas should be contrasted with the equally valuable activity of us-
ing computers as tools for interpreting biological data and simulating biological
systems. This computational attitude towards biological systems has been valuable
in computer science itself, too; by observing how biological systems solve prob-
lems, new algorithms for problem solving on computers can be developed.
The aim of this chapter is to tease out some details of how ideas from comput-
ing can be used to inform thinking about biological questions, and vice versa. In
keeping with the theme of the book an attempt is made to use ideas from cellular
and tissue-level biology. The following questions indicate the main issues ad-
dressed:
• What kind of computer science is needed to answer biological questions?
• What does computational complexity mean when computing is grounded in the
physical world?
• Does computation place limits on what sort of thing is possible in the world, and
how does this fit in with other ways of assessing possibility?
• What does the ability of computers to simulate or not be able to simulate a sys-
tem say about those systems?

• What kinds of transformations in the world can be regarded as being computa-
tions; and which transformations can be thought of as not being computations?

2 Computer Science or Computer Science?

'Who could believe an ant in theory? A giraffe in blueprint? Ten thousand doctors of what's
possible could reason half the jungle out of being.' John Ciardi [1]
It is a tired cliché of popular psychology that we only use 10% of our brains. It
is unlikely that this is true, but it is interesting to consider how a statement like this
might be interpreted. Is this a physical statement? Could we cut out the 90% that
isn't being used, throw it away, and still function normally? Is this a biological
question, meaning that we are only using 10% of the available neuronal pathways
or only activating 10% of the signals that we could? This is perhaps closer, but still
not ideal. Does it mean that we could store 10 times as much 'stuff' if we were
working at full capacity? Think 10 times as quickly? These are still fairly ill-
defined questions, but they conform to the intuitions which people have about the
brain, and they are at heart computational questions. They are questions about the
capacity of an entity for the storage of information and its ability to process that in-
formation.
The point of this story is to illustrate that we already think about biological
processes in terms of computational ideas, even if in an informal way. It is not sur-
prising to find that we think about the brain in this way, given both the popular
view of brains as being essentially computers and the view, dating back to the early
days of computing, of computers as 'electronic brains'. However, it is only a short
step from this to start thinking about other parts of the body (whether at the tissue
level or the cellular level) in computational terms.
Many cellular systems have an information processing aspect. The immune sys-
tem is well studied from this perspective [2, 3], and there is potential to view the
signal transduction system in this way. What kinds of computer science are needed
to help understand these kinds of system?

2.1 Complexity of Natural Computation

One example of a piece of computer science theory that could provide a tool for
the understanding of natural computation is a theory of complexity. If we consider
computation to be something that is grounded in the world, then how does that in-
fluence our view of computational complexity? What kinds of complexity in nature
arise out of the presence of computations in natural systems? Clearly we can al-
ways define certain formal systems as being what we mean by the term computa-
tion, and then derive/define certain measures of complexity with respect to this
definition. However, if we want to apply computational ideas in the context of ana-
lysing transformations of the world then we might want to not ground these in par-
ticular axiomatisations of computation, as we might not be able to show that the
physical system conforms to that axiomatisation.

An interesting example of this is protein folding, where a linear protein chain
can fold into a three-dimensional structure in a very short amount of time [4].
The number of possible configurations that this three-dimensional structure could
take is vast, and the problem of calculating the structure is computationally hard (e.g. it
has been shown to be NP-complete [5, 6]). How, therefore, does the protein 'com-
pute' its configuration on a realistic timescale? This is a well-known problem in
theoretical biology, known in less formal terms as Levinthal's paradox [7]. It may
be the case that there is little to explain here; whilst the process happens quickly, it
may be that the process is simply doing conventional computation very quickly,
and if we were able to measure the timescales on which the folding was hap-
pening with accuracy there would be sufficient time to do enough operations. However, if
it is not, we are presented with the interesting question of how the system proc-
esses the information sufficiently quickly to produce the result. Is it exploiting
some property of the world which we do not use in building conventional com-
puters, and which we do not therefore incorporate into our conventional models of
computing? Is it exploiting computation in some way which means that we cannot
do it on conventional computers (e.g. using some form of high-density parallelism
in which small parts of the system can be considered as doing local computations
from which the global structure emerges)? Or are we wrong to assert in the first
place that if something changes in a way such that we can measure its computa-
tional complexity then it is necessarily solving the problem in a computational way?
A similar problem occurs with mathematical models. We can demonstrate that a
certain complex set of differential equations models the turbulent motion of a seed
blowing around in the wind. However, a bird moving to catch such a seed doesn't
need to solve those equations in order to catch the seed, nor does the seed need to
be aware of the equations in order to carry out the movement. Perhaps the problem
is simply a confusion between a description of the system and the system itself.

2.2 Simulation of Natural Systems

An interesting perspective on the relationship between natural systems and compu-
tational systems is considering the idea of simulating the system in question on a
computer. One of the most interesting results to come out of such a thought is the
original work on quantum computing by Feynman. His original idea about quan-
tum computing came from considering the idea of simulating quantum physics on
computers [8]:
'the full description of the quantum mechanics for a large system [...], because it has too
many variables, cannot be simulated with a normal computer [...]. And therefore, the prob-
lem is, how can we simulate the quantum mechanics? There are two ways we can go about
it. We can give up on our rule about what the computer was, we can say: Let the computer
itself be built out of quantum mechanical elements which obey quantum mechanical laws.
[...]'

We can generalize this idea to all natural systems as follows. Given any system
in the world and some idea of what we mean by a computer, either we can simulate
it on the computer or not. There are two variants of this. In the first we consider
what can be simulated at all. For example we cannot accurately simulate most non-
discrete systems using a computer with a finite memory (we can clearly simulate
some such systems, such as relationships between two intervals on the real line
which can be described by a finitely describable function). This is regardless of the
amount of time we take; even given any finite number of timesteps we cannot even
represent the initial state of the system exactly. Other systems admit an inefficient
simulation. For example a problem like factoring composite integers is hard (in the
technical sense) on conventional computers, yet proposed quantum computers [9]
provide a technology on which polynomial-time algorithms for the same problem
can be executed.
The consequence of this is that if we cannot simulate the system (efficiently, or
at all) on the computer then theoretically there is a property of the world which can
be used as a substrate for computation. Clearly whether the particular property ad-
mits its use for computing artificially created problems will vary from case to case.
In particular a significant feature is whether the system admits control over its in-
puts; many computations are happening in the natural world which cannot take in-
puts other than the ones which they receive as part of a larger system. Therefore we
cannot say that merely observing the act of computation in a natural system pro-
vides the core of a practical computation system.

3 Grades of Possibility
'There seem to be at least four different kinds or grades of possibility: logical, physical,
biological, and historical, nested in that order.' Daniel Dennett [10]
Does computation have a role to play in explaining what is possible in the
world? It has been suggested by Dennett [10] that there is a hierarchy of 'grades of
possibility' of things in the world. He suggests the following as such a hierarchy
(with the possibility of other grades being added):

• Logical
• Physical
• Biological
• Historical
In order for something (a biological something, that is) to actually exist in the
world, it has to be possible at all levels. However, given a putative object which
does not exist in the world, that non-existence can be explained at one of the levels.
Some things are not logically possible, for example an object which simultane-
ously exists and doesn't. In order to explain the impossibility of such an object a
logical explanation suffices; it is not necessary to go further in the hierarchy to ex-
plain the impossibility of such an object. Just because a putative object contains
characteristics which associate it with a point on the hierarchy doesn't necessarily
place it there; 10 m high ants would be biological objects, but it is not necessary to
go as far as biological in the hierarchy to explain their absence in the world; phys-
ics will do that for us. Therefore, for every putative object, each of those stages can
be examined in turn to determine whether it is possible at that level. Broadly, any
object placed at one level must also be possible at the previous level, though there
What Kinds of Natural Processes Can Be Regarded as Computations? 331

are complexities; in particular, the 'historical' level can contain objects which are
physically possible but which have no biological aspect, so it is impossible to place
them meaningfully in or out of the biological category. There are a number of ways
to deal with this consistently, e.g. allowing non-biological objects through the bio-
logical layer without being classified, or branching the hierarchy into 'biological'
and 'non-biological' branches. The final stage in the hierarchy is concerned with
what has actually happened; thus it is necessary to fix a particular point in time be-
fore it is possible to make statements about that final point in the hierarchy.
It is an interesting thought-experiment to take each of the grades in the hierarchy
and think of some putative object which is impossible because of precisely that
reason, i.e. the reason 'earlier' in the hierarchy admits it, whilst the current reason
is sufficient to dismiss it without needing to go further into the list.
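The explanatory scheme above can be rendered as a small sketch (an illustration of this chapter's reading of Dennett, not anything from [10]): walk the nested grades in order and report the earliest grade at which a putative object fails, since no later grade is needed to explain its absence. The predicates below are invented stand-ins.

```python
GRADES = ["logical", "physical", "biological", "historical"]

def first_impossible_grade(obj, tests):
    """Return the earliest grade whose test obj fails, or None if it
    passes all four (i.e. the object actually exists or existed)."""
    for grade in GRADES:
        if not tests[grade](obj):
            return grade
    return None

# Invented, purely illustrative predicates:
tests = {
    "logical": lambda o: not o.get("self_contradictory", False),
    "physical": lambda o: o.get("kind") != "ant" or o.get("height_m", 0) < 1,
    "biological": lambda o: True,
    "historical": lambda o: o.get("has_existed", False),
}

# 10 m high ants already fail at 'physical', so no biological
# explanation of their absence is needed.
giant_ant = {"kind": "ant", "height_m": 10}
print(first_impossible_grade(giant_ant, tests))  # → physical
```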

3.1 Computational Possibility

Might it be reasonable to introduce a new grade into this hierarchy, computational
possibility? Such a grade would include items which are possible because they re-
quire computation to be carried out in order for them to exist, and they require that
that computation would be feasible given the computational resources available in
the system. Where would such a grade of possibility go in the hierarchy? To avoid
the complications about non-biological objects discussed above, let us restrict our-
selves to biological systems only. Firstly let us try to introduce a computational
grade between 'biological' and 'historical'. What might be an example of some-
thing which is biologically plausible but computationally impossible (or effectively
impossible)? One example might be a type of asexually reproducing bacteria (or
any other kind of asexually reproducing creature) which are genetically identical
from generation to generation. The reason this is implausible is the im-
perfection of the information-transmission process from generation to generation;
as mutations occur and get passed on to future generations, so the initial genomic
uniformity gets broken down. Clearly this could be regarded as a biological prop-
erty; but by the same token all the biological properties could be regarded as
physical in origin. What we are trying to do is to refine the space between the two
extremes of the hierarchy.
Where else might 'computational' be placed in the hierarchy? If computational
is placed between 'physical' and 'biological' we are concerned with computational
systems which can be realized in the physical world yet which cannot be imple-
mented by biological systems. It is hard to think of a non-trivial example. It might
be reasonable to assert that biology places constraints on the size of creatures, and
therefore on the amount of information which they can store; therefore some com-
puter systems could be physically created which wouldn't be capable of biological
realization. This seems to be an unsophisticated example, however. As we discover
more about the mechanisms by which biological systems compute, we may find
more things in this category.
Another possibility would be to place 'computational' between 'logical' and
'physical'. This would suggest that there are computational constraints on the laws
of physics; such an idea has been occasionally explored by theoretical physicists
[11, 12]. Exploration of this idea would take us too far from our main discussion
here.
It may be that 'computational' can be meaningfully placed at a number of points
in the hierarchy, and that these different placements give a taxonomy of different
kinds of computational phenomena.

4 Can a Change not be a Computation?

'Does a rock compute every finite-state automaton?' David Chalmers [13]
In the above discussion we have been considering the consequences of consider-
ing certain actions in the natural world to be computations. In this section the ques-
tion is reversed. Consider the following question: are there any transformations in
the natural world which we can not meaningfully regard as being computations?
By considering some action in the world to be a computation, a number of ques-
tions arise about that action and the system in which it occurs: how is information
stored in the system? What is the scope of transformations which can be made to
that information? How does the complexity of doing that transformation place con-
straints on what the system can do on a particular timescale?
If we want to stop thinking of computing as just something which happens in
machines inside beige boxes containing electrical circuits and consider it to be a
property of natural systems such as cellular systems then we need to decide where
to stop. There would seem to be a danger in using the term 'computation' exces-
sively to the point where it just becomes synonymous with 'change' or 'transfor-
mation'. Given the set of all possible transformations which can happen in the
world (or the particular part of the world which we are interested in, e.g. the cellu-
lar world), to which of them do we want to ascribe the label 'computation'? On a
trivial level we are free to use this word in any way we want, so perhaps we should
refine the question somewhat. A better version might be this: given the set of trans-
formations which can happen in the world, how can they be divided into 'computa-
tions' and 'non-computations' in a way which respects the essential properties of
computation in machines? The difficulty here is with the word 'essential'; given
that we are attempting to extend a concept away from the domain in which it was
originally defined, we must let go of some ideas, otherwise there would be no
problem. In the rest of this section I would like to consider a number of features
which might help to make a useful distinction.

4.1 Observability

Is the notion of a change in the world being observed essential to the idea of in-
cluding it in the set of computations? This idea can be unpacked in two directions.
Firstly it doesn't seem essential that the computation itself be observable, only
that the inputs and outputs be. In normal computing, we are happy with the idea
that a user of a system (whether that user is a human or another computer system
making an automated enquiry) interacts with the system by specifying input and in
turn receives output; they do not need to see the states which the machine takes on
in between. Indeed it seems natural to extend the idea of computation to those sys-
tems where the changing state cannot be observed without disturbing the process,
as in quantum computing.
Secondly we can concentrate on the question of what is doing the observing. It
does not seem necessary to restrict the observer to being a conscious entity; it
would seem reasonable to suggest that in a multi-component system, one compo-
nent can carry out a computation and pass its output onto another. It may be the
case that a system can be self-observing.
The aim of considering observability is to attempt to exclude those parts of the
world which are changing but not affecting other parts; however, this does not
seem to be a significant part of 'computing'. Whilst transformations happen in the
world without being observed (in the broad sense of passing their data onto another
system), it does not seem that we should exclude these from what we regard as
computations, or that this is a significant distinction (it is akin to 'when a tree falls
in the woods, does it make a sound?' - entirely dependent on definition).

4.2 Consistent Ascribing of Symbols

An important characteristic of computing is that symbols within the system have a
consistent interpretation throughout the computation, or at least, if they do not, there
is a component of the system which explains how the interpretation of the symbols
changes as the computation progresses. That is, any external system which ob-
serves and/or initiates a computation must declare in advance how it is going to in-
terpret those symbols. This seems to be a key characteristic of computing which
can be applied to natural systems.
If there is not a consistent allocation of symbols then transformations are mean-
ingless. In particular if we are completely free to assign any symbol to any mean-
ing at any point in the computation then we can say that any transformation is do-
ing any computing (subject to certain restrictions on the number of bits being
transformed). This is akin to the 'can a rock implement every finite-state automa-
ton' argument [13, 14]. If we take a trivial 'transformation' of a system (i.e. one in
which nothing changes as a result of the transformation) and we are free to change
the interpretation of the symbols, then we can just 'relabel' the unchanged symbols
in terms of the desired output; we would presumably not want to ascribe the prop-
erty of computation to that trivial non-transformation.
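The relabelling move can be shown in a few lines (a toy sketch, not taken from [13] or [14]): if the interpretation of symbols may be chosen after the fact, a transformation in which nothing changes can be read off as any 'computation' we please, which is exactly why a consistent, declared-in-advance interpretation matters.

```python
def identity(state):
    """A trivial 'transformation': nothing changes (the rock)."""
    return state

def read_off(state, interpretation):
    """Interpret the final state under a chosen symbol mapping."""
    return interpretation[state]

after = identity("s0")  # the rock's state before and after

# Choosing the interpretation only after seeing the desired answer:
print(read_off(after, {"s0": 1}))  # 'the rock computed NOT(0)'
print(read_off(after, {"s0": 0}))  # 'the rock computed 0 AND 1'
```

Fixing the interpretation in advance, and keeping it fixed for every input, is what blocks this trick.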
It seems that many biological systems to which we want to ascribe the idea of
computation support this idea. The output from a computation on a traditional
computer passes a stream of bits to a screen or printer which are interpreted in a
consistent way so as to display a particular text or image. In biological cells the
end result of a sequence of signal transduction steps initiated at a particular re-
ceptor is a particular protein; in protein folding a particular amino acid sequence
gives rise to a particular three-dimensional structure (or one drawn from a particu-
lar probability distribution of structures). It is important to make a distinction be-
tween consistent and deterministic here; this property does not exclude probabilis-
tic actions being included in computations.

4.3 Digital Encoding

Is a digital encoding of information necessary in order to call some transformation
a computation? Many discussions of computing assume that digital information is
at the heart of computing, and the fact that the genetic system is digital is often
seen as one of the core arguments for evolution and development having computa-
tional aspects to them. It is possible, however, to construct computational devices
out of non-digital components, and to construct algorithms which make use of ana-
logue representations of information; indeed in the early development of comput-
ing digital and analogue approaches to computation were developed alongside each
other. If we are to think of computing as something which occurs in a wider variety
of systems, it would seem that we shouldn't take the presence of a digital represen-
tation of information as a key factor in deciding whether a system is computational
or not. Indeed in many cellular systems the structures for representing information
digitally do not seem to exist.

4.4 Flexibility of Inputs

Another factor which we may want to take into account in developing a distinction
between computation and non-computation is the flexibility that an external system
has to change the input. An important characteristic of computing is that computers
act on different data; they don't just do the same action all the time. Still important,
though perhaps less core to the idea of computing, is the idea of programmability.
The ability to influence the system by adding new information would seem to be a
core idea in ascribing an idea of 'computing' to an action in the world. Again this
is well illustrated by the protein folding problem; one of the reasons that we can
easily apply computational reasoning to understanding that problem is that we can
put information into the system as symbol strings, and the range of inputs is vast.
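As an illustration of inputs-as-symbol-strings, here is a minimal sketch of the kind of encoding used by the HP model of refs [4-6]: each residue of an amino acid sequence is mapped to H (hydrophobic) or P (polar), and any of the vastly many possible sequences can be fed in. The particular set of one-letter codes treated as hydrophobic below is an assumption for illustration, not taken from the chapter.

```python
# Assumed (illustrative) hydrophobic residues, one-letter codes:
HYDROPHOBIC = set("AVLIMFWCY")

def to_hp(sequence):
    """Encode an amino acid sequence as an H/P symbol string."""
    return "".join("H" if aa in HYDROPHOBIC else "P"
                   for aa in sequence.upper())

print(to_hp("MKVLAT"))  # → HPHHHP
```

With 20 letters per position there are 20**n possible inputs of length n, which is the sense in which 'the range of inputs is vast'.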

4.5 Intention to Initiate a Change

A final property we shall consider here is whether the intention to do a computa-
tion is a significant factor in deciding which natural transformations should be re-
garded as computations. As with the idea of observability above, intention here
need not mean conscious intention. However, it seems important when ascribing
the notion of computation to an action that it be triggered by some other system (or
by itself, but in a more sophisticated way than just existing in a changing state)
with the end result that the output from the system will act in the world. By making
this distinction we draw a line between those transformations which are part of
some system which is acting to effect a deliberate change on the world, and those
which are just happening because the laws of physics are acting on certain pieces
of matter.

4.6 Summary

Clearly we could consider other properties. However, it seems that we are begin-
ning to tease out what we might mean by a 'computation', and what transforma-
tions we might ascribe to the category 'not a computation'. Clearly this is a topic
around which much future discussion could revolve.
In particular it is interesting to ask whether it is only a historical happen-
stance that our first encounter with the concept of computation was through the
synthetic creation of computing devices. Could we instead have come across some
of the important ideas in computing by an analytic study of biological systems? If
so, which concepts would have most easily been discovered through such an ana-
lytic study? What systems, and what commonalities between systems, might have
suggested these concepts? Are there other concepts, now regarded principally as
analytic scientific concepts, which were originally discovered as synthetic engi-
neering ideas? If so, how did the transition occur from the concept being seen as
purely synthetic to it being seen as a scientific concept which could be applied in
an analytic fashion to understand objects in the natural world? If we believe that
this is important, how do we go about encouraging such an enrichment of ideas in
the context of concepts from computing?
How can we revisit computational ideas without being overly distracted by the
kind of computation that we see on computational devices? What would computer
scientists be studying if the computer had not been invented?

References

1. J. Ciardi. The Collected Poems of John Ciardi. University of Arkansas Press, 1997. Ed-
ited by Edward M. Cifelli.
2. S. Forrest and S.A. Hofmeyr. Immunology as information processing. In L.A. Segel
and I. Cohen, editors, Design Principles for the Immune System and Other Distributed
Autonomous Systems. Oxford University Press, Oxford, 2000.
3. A. S. Perelson and G. Weisbuch. Immunology for physicists. Reviews of Modern Phys-
ics, 69(4): 1219-1267, 1997.
4. A. Fraenkel. Complexity of protein folding. Bulletin of Mathematical Biology, 55(6):
1199-1210,1993.
5. A. Berger and T. Leighton. Protein folding in the hydrophilic-hydrophobic (HP) model
is NP-complete. Journal of Computational Biology, 5(1): 27-40, 1998.
6. P. Crescenzi, D. Goldman, C. Papadimitriou, A. Piccolboni, and M. Yannakakis. On
the complexity of protein folding. Journal of Computational Biology, 5: 423-465,
1998.
7. C. Levinthal. How to fold graciously. In J. T. P. DeBrunner and E. Munck, editors,
Mossbauer Spectroscopy in Biological Systems: Proceedings of a meeting held at Al-
lerton House, Monticello, Illinois, pages 22-24. University of Illinois Press, Chicago,
Ill., 1969.
8. R. P. Feynman. Simulating physics with computers. International Journal of Theoreti-
cal Physics, 21: 467-488, 1982.
9. C. P. Williams and S. H. Clearwater. Explorations in Quantum Computing. Springer,
Berlin Heidelberg New York, 1998.
10. D. Dennett. Darwin's Dangerous Idea: Evolution and the Meanings of Life. Penguin,
1995.
11. J. Schmidhuber. A computer scientist's view of life, the universe, and everything. In
C. Freksa, M. Jantzen, and R. Valk, editors, Foundations of Computer Science: Poten-
tial - Theory - Cognition, pages 201-208. Springer, Berlin Heidelberg New York,
1997. Lecture Notes in Computer Science.
12. J. Schmidhuber. Algorithmic theories of everything. Technical Report 20-00, IDSIA,
2000.
13. D. J. Chalmers. Does a rock implement every finite-state automaton? Synthese, 108:
309-333, 1996.
14. H. Putnam. Representation and Reality. MIT Press, 1988.
List of Contributors

Amos,M.
School of Biological Science and Engineering and Computer Science,
University of Exeter, Exeter EX4 4JH, United Kingdom

Bolouri, H.
Institute for Systems Biology, Seattle, WA 98103-8904, and Division of
Biology, 156-29 California Institute of Technology, CA 91125 USA

Brown, D.
School of Biomedical and Clinical Laboratory Sciences,
University of Edinburgh, Hugh Robson Building, George Square,
Edinburgh EH8 9XD, United Kingdom

Brown, R.
Mathematics Division, School of Informatics, University of Wales,
Bangor, Gwynedd LL57 1UT, United Kingdom

Bull, L., Tomlinson, A.
Faculty of Computing, Engineering and Mathematical Sciences,
University of the West of England,
Bristol BS16 lQY, United Kingdom

de Castro, L. N.
School of Electrical Engineering and Computing, State University of
Campinas, 13081-970 Campinas, São Paulo, Brazil

Cho, K.-H.
School of Electrical Engineering, University of Ulsan,
Ulsan, 680-749 South Korea

Feng, J.
COGS, Sussex University, Brighton BN1 9QH, and Newton Institute,
Cambridge University, Cambridge CB3 0EH, United Kingdom

Fisher, M. J., Saunders, J.
School of Biological Sciences, University of Liverpool,
Liverpool L69 3BX, United Kingdom

Gregory, R., Malcolm, G., Paton, R. C.
Department of Computer Science, University of Liverpool,
Chadwick Building, Peach Street, Liverpool L69 7ZF, United Kingdom
338 List of Contributors

Hart, E.
School of Computing, Napier University
Edinburgh, Scotland, United Kingdom

Holcombe, M.
Department of Computer Science, University of Sheffield,
Regent Court, Portobello Street, Sheffield S1 4DP, United Kingdom

Johnson, C. G., Knight, T.
Computing Laboratory, University of Kent at Canterbury
Canterbury, Kent CT2 7NF, United Kingdom

Kolch, W.
Institute of Biomedical and Life Sciences, University of Glasgow
CRC Beatson Laboratories, Garscube Estate,
Switchback Road, Glasgow G61 1BD, United Kingdom

Leng, G.
School of Biomedical and Clinical Laboratory Sciences,
University of Edinburgh, Hugh Robson Building, George Square,
Edinburgh EH8 9XD, United Kingdom

MacGregor, D. J.
School of Biomedical and Clinical Laboratory Sciences,
University of Edinburgh, Hugh Robson Building, George Square,
Edinburgh EH8 9XD, United Kingdom

McNeil, C. J., Snowdon, K. J.
Institute for Nanoscale Science and Technology,
University of Newcastle upon Tyne,
Newcastle upon Tyne NE1 7RU, United Kingdom

Monk, N.
Centre for Bioinformatics and Computational Biology,
Division of Genomic Medicine, University of Sheffield,
Royal Hallamshire Hospital, Sheffield S10 2JG, United Kingdom

Nagl, S. B.
Department of Biochemistry and Molecular Biology,
University College London,
Gower Street, London WC1E 6BT, United Kingdom

Parish, J. H.
School of Biochemistry and Molecular Biology, University of Leeds,
Leeds, LS2 9JT, United Kingdom
List of Contributors 339

Porter, T.
Mathematics Division, School of Informatics, University of Wales,
Bangor, Gwynedd LL57 1UT, United Kingdom

Sant, P.
Department of Computer Science, King's College
London WC2R 2LS, United Kingdom

Schilstra, M.
Science and Technology Research Centre, University of Hertfordshire,
College Lane, Hatfield, Hertfordshire AL10 9AB, United Kingdom

Tateson, R.
Future Technologies Group, Intelligent Systems Lab,
BTexact Technologies, PPlIl2 Orion Building, Adastral Park,
Martlesham, Ipswich IP5 3RE, United Kingdom

Timmis, J.
Computing Laboratory, University of Kent
Canterbury, Kent, CT2 7NF, United Kingdom

Tyrrell, A.
Bio-inspired Architectures Laboratory, Department of Electronics,
University of York
York YOlO 5DD, United Kingdom

Warner, G. 1.
Unilever Research Colworth
Colworth House, Sharnbrook, Bedford MK44 ILQ, United Kingdom

Wu, Q. H.
Department of Electrical Engineering, University of Liverpool
Liverpool L69 3BX, United Kingdom

Wolkenhauer, O.
Department of Biomolecular Sciences
and Department of Electrical Engineering and Electronics,
Control System Centre, UMIST, Manchester M60 1QD, United Kingdom
Index

abelian 283
algebra 253, 277, 290
agent 262, 277
analogy 73, 94
animat 40
antibody 55
ants 10, 262
Amos 269
artificial immune system (AIS) 51, 107
autonomy 14, 262
automata 252 (see also machine)
bacteria 21, 151, 161
Bersini 51
bioinformatics 125
biomedicine 140
blastocyst 18
Bolouri 149
Brown, D. 227
Brown, R. 289
Brownian motion 198
Bull, L. 27
category theory 278, 286, 289
CATH 130
cell 12, 98, 264, 306
cell lineage 175
cellsim 20
Cho 305
ciliate 269
classification 126
clonal selection 62
colimit 279, 296
communication network 10
communicating x-machine 259
complex systems 135, 138, 328
corporation 39
COSMIC 161
cybernetics 306
Davidson 153
de Castro 51
decentralized approach 9
decision theory 185
delta 15, 212
delta-notch signaling 215
design 11, 17
diffusion 219
dopamine 230
Drosophila 13, 15
E. coli 14, 161
Ehresmann 292, 308
Eilenberg 254, 290
electrophysiology 231
embryonics 93, 111
embryogenesis 18
emergence 14
error detection 109
evolution 10, 12, 138, 162
explicit binding model 214
fault diagnosis 78
fault tolerance 11, 107
Feng 185
Feynman 117, 329
field programmable gate array 97
finite state machine 106, 252
Fisher 277
fitness 39
function 121, 127, 134, 277
'glue' 134, 280
GH releasing hormone 229, 234
GH model 237, 240
gene expression 135, 149, 176
gene rearrangement 272
gene structures 270
genomics 121, 135
genetic algorithm 29
genetic regulatory network 149
Goguen 287, 296
Gregory 161
grid computing 173
growth hormone (GH) 227, 234
Hart 51
Hasse diagram 292
hierarchy 251, 264, 289, 292, 308
Holcombe 251
Holland 38
hormone 227
hybrid machines 260
hypothalamus 229, 233
immune memory 56
immune network 54, 57
immune system 51, 328
immunology 51
immunotronics 104, 112
implicit binding model 214
independent component analysis 185
individual based modeling 278
information processing proteins 132, 280
infomax 188, 197
integrate-and-fire 186, 189
integration 307
Jacob-Monod 166, 306
Jerne 51
Johnson 327
juxtacrine 211
Kauffman 28, 153
Kolch 305
Knight 51
Krebs cycle 253
Krohn 252
lateral inhibition 215
learning classifier system (LCS) 28, 37, 278
Leng 227
ligand binding 214
ligation 270
lymphocyte 54
McNeil 117
MacGregor 227
machine 251, 327
machine learning 27, 36, 51, 68, 198, 202
macromolecules 126
macronucleus 270
Malcolm 277
MAP Kinase 277, 282, 317
Markov 44
Maude 285
metabolism 13
metadynamics 61
metaphor 52, 66, 126, 131, 142
micronucleus 270
Monk 211
morphism 290
morphogenesis 17
mRNA 151
NANOMED 117
nanotechnology 188
Nagl 125
Nash equilibrium 34
neural network 143
neuron 185, 229
neuroendocrine system 227
NKCS model 28
noise 186
notch 15, 212
OBJ 285
observability 333
operon 166
optimization 90
organization 306
Oxytricha 269
Parish 125
parallel distributed processing 138
Paton 1, 125, 161, 277, 289
pattern formation 212, 215
Perelson 51
pituitary 227, 234, 237
pheromone 10, 262
Porter 289
protein 127, 130, 277
post-genomics 135
Poisson process 189
PSD 129
rat 228
recombination 270
reliability 105
regulation 151, 306
reinforcement learning 37
rewriting 286
robotics 76
Rosen 290, 309, 316
Rozenberg 273
Sant 269
Saunders 161
scaffold protein 282
scaling 310
scanning tunneling microscope 117
Schilstra 149
scrambled genes 271
SCOP 130
sea urchin 154
secondary messenger 281
self-assembly 121
self/non-self 63, 107
self-organization 64, 308
semigroup 254
shape space 59
signalling 212, 280, 315
Snowdon 117
somatostatin 229, 234
splicing 270
stochastic 185
supervised learning 202
SWISS-PROT 129
symbiosis 28
symbiogenesis 28
symbol 333
systems biology 139, 308, 314
Tateson 9
telecommunication 9, 14
telephone networks 11
therapeutic delivery system 122
Thomas 153
Timmis 51
tissue 266
Tomlinson 27
transcription 150, 166
transcription factors 133, 151
traveling front 218
Turing 254
Tyrrell 93
unsupervised learning 203
Varela 51
Velcro 9
Warner 125
Wilson 37, 40
Wolkenhauer 305
Woods 40
Wu 161
x-machine 255
yeast 277
ZCS 37
Natural Computing Series
W.M. Spears: Evolutionary Algorithms. The Role of Mutation and Recombination.
XIV, 222 pages, 55 figs., 23 tables. 2000
H.-G. Beyer: The Theory of Evolution Strategies. XIX, 380 pages, 52 figs., 9 tables. 2001
L. Kallel, B. Naudts, A. Rogers (Eds.): Theoretical Aspects of Evolutionary Computing.
X, 497 pages. 2001
G. Păun: Membrane Computing. An Introduction. XI, 429 pages, 37 figs., 5 tables. 2002
A.A. Freitas: Data Mining and Knowledge Discovery with Evolutionary Algorithms.
XIV, 264 pages, 74 figs., 10 tables. 2002
H.-P. Schwefel, I. Wegener, K. Weinert (Eds.): Advances in Computational Intelligence.
Theory and Practice. VIII, 325 pages. 2003
A. Ghosh, S. Tsutsui (Eds.): Advances in Evolutionary Computing. Theory and
Applications. XVI, 1006 pages. 2003
L.F. Landweber, E. Winfree (Eds.): Evolution as Computation. DIMACS Workshop,
Princeton, January 1999. XV, 332 pages. 2002
M. Hirvensalo: Quantum Computing. 2nd ed., XI, 214 pages. 2004 (first edition
published in the series)
A.E. Eiben, J.E. Smith: Introduction to Evolutionary Computing. XV, 299 pages. 2003
A. Ehrenfeucht, T. Harju, I. Petre, D.M. Prescott, G. Rozenberg: Computation in Living
Cells. Gene Assembly in Ciliates. XIV, 202 pages. 2004
L. Sekanina: Evolvable Components. From Theory to Hardware Implementations.
XVI, 194 pages. 2004
G. Ciobanu, G. Rozenberg (Eds.): Modelling in Molecular Biology. X, 310 pages. 2004
R.W. Morrison: Designing Evolutionary Algorithms for Dynamic Environments.
XII, 148 pages, 78 figs. 2004
R. Paton†, H. Bolouri, M. Holcombe, J.H. Parish, R. Tateson (Eds.): Computation in Cells and
Tissues. Perspectives and Tools of Thought. XVI, 358 pages, 134 figs. 2004
M. Amos: Theoretical and Experimental DNA Computation.
Approx. 200 pages. 2004
