
Kybernetes © Thales Publications (W.O.) Ltd.
1981, Vol. 10, pp. 159-166. Printed in Great Britain

BEAGLE: A DARWINIAN APPROACH TO PATTERN RECOGNITION

RICHARD FORSYTH
Department of Mathematics, Polytechnic of North London, Holloway Road, London N7 8DB (U.K.)

(Received December 17, 1980)

"There is grandeur in this view of life, with its several powers, having been originally breathed by the Creator into a few forms or into one; and that whilst this planet has gone cycling on according to the fixed law of gravity, from so simple a beginning endless forms most beautiful and most wonderful have been, and are being evolved."

Charles Darwin, The Origin of Species.

BEAGLE (Biological Evolutionary Algorithm Generating Logical Expressions) is a computer package for producing decision-rules by induction from a database. It works on the principle of "Naturalistic Selection" whereby rules that fit the data badly are "killed off" and replaced by "mutations" of better rules or by new rules created by "mating" two better adapted rules. The rules are Boolean expressions represented by tree structures.

The software consists of two Pascal programs, HERB (Heuristic Evolutionary Rule Breeder) and LEAF (Logical Evaluator And Forecaster). HERB improves a given starting set of rules by running over several simulated generations. LEAF uses the rules to classify samples from a database where the correct membership may not be known.

Preliminary tests on three different databases have been carried out: on hospital admissions (classing heart patients as deaths or survivors), on athletic physique (classing Olympic finalists as long-distance runners or sprinters) and on football results (categorizing games into draws and non-draws).

It appears from the tests that the method works better than the standard discriminant analysis technique based on a linear discriminant function, and hence that this long-neglected approach warrants further investigation.

1 INTRODUCTION

This report describes BEAGLE (Biological Evolutionary Algorithm Generating Logical Expressions), which is a computer system for producing decision-rules by induction from a database. It works on the principle of natural, or at least naturalistic, selection. Thus it represents a weaving-together of strands in the thought of three great 19th-century Englishmen: Boole, Babbage and Darwin.

While "knowledge engineering" or "knowledge refining" is currently enjoying something of a vogue and has already begun to produce impressive results [1], this report contains a plea not to [neglect a] parallel endeavour with a less mechanistic flavour that we might call "knowledge farming" or perhaps "sophiculture". In particular it is the author's contention that the great principle of natural selection is a valuable tool in the stock-in-trade of the conscientious knowledge engineer (or farmer).

The idea of systems that improve by a computational analogy with survival of the fittest has been pursued before [2-4], but has lapsed from favour somewhat since the pioneering spirit of Cybernetics was consolidated into the mature (?) discipline of Artificial Intelligence. Selfridge's "Pandemonium" was an early example [5] of a system designed to contain "the seeds of self-improvement" which involved, among other things, replacing "demons" which discriminated
poorly among the input patterns they were supposed to distinguish with new "demons" formed by randomly altering the parameters of surviving ones. But probably the only really thorough-going attempt to "breed" intelligence in the abstract was by Barricelli and Bell [6].

Barricelli's "symbioorganisms" were sequences of integers that existed in a universe consisting of an array of cells. Whenever two organisms both attempted to expand into the same space a game of Tac-Tix was played between them to the death. The number patterns of the organisms were interpreted as moves in the game. The surviving organisms were allowed to reproduce (asexually, it appears) and some random mutations introduced, after which the process was repeated. After some thousands of generations he had a collection of organisms that were expert at Tac-Tix. Barricelli found it quite an effective technique, and it is my view that it is due for a revival.

2 BEAGLE: THE USER'S VIEW

The system as presently implemented consists of two PASCAL programs running on the DEC System-10 at Polytechnic of North London, namely HERB (Heuristic Evolutionary Rule Breeder) and LEAF (Logical Evaluator And Forecaster). They can be accessed like any other statistical package and in function correspond most closely to discriminant analysis. Together they perform the task of classifying samples into one of two or more categories on the basis of the values of a number of measures or parameters describing each sample. HERB creates and/or modifies the classification rules which LEAF then uses, typically to forecast group membership for samples whose class is not known.

2.1 The HERB Program

HERB requires three input files: a datafile, a payoff file and an old rule file (possibly empty). It produces as output a new rule file which is as good as or better than the old one.

The datafile contains a "training set" of samples for which the categories are known. It should begin with two integers, W and F. W is the width in characters of the description field for each sample (0 if absent). F is the number of features. Then follows the data: for each case the description field of W characters, F numbers which are measures for the case on each feature or variable (integers only at present, with at least one space or new line to separate them), and lastly a number indicating the actual category to which that case belongs. (The category number must end a line.) There follows the first three lines of a typical datafile.

4 18
517 68 165 1 ...

This is the beginning of a file of data from 113 patients admitted to hospital with heart complaints [7]. Each patient was measured on 18 variables on admission. Preceding the 18 scores is an identification number (4 characters) which is 517 for this patient. Following the scores is the category number (1 = lived, 2 = died). These cases were used for testing: see section 4. (The first 3 variables are age, height and sex; so this patient was 68 years old, 165 cm tall, and male ... he survived.)

To enable the program to assess each rule's performance the user must also furnish a payoff matrix in a separate file which effectively states the value or cost of each classification or misclassification. The payoff file also indicates how many categories are in use. Since the program works on tri-state logic, where 1 = yes, 0 = don't-know and -1 = no, this means a 3 by NC table where NC is the number of classes. (Later releases will allow the user to specify one of several multi-state logics of which 0..1, Boolean, will be a special case.)

For the tests on the hospital admission data the payoff matrix was as follows.

                        Actual Class
Computer Decision    1 (lived)   2 (died)
-1 (no)                 -1          +1
 0 (maybe)               0           0
+1 (yes)                +1          -1
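
As an illustration of the datafile layout and the payoff table described above, here is a minimal Python sketch. It is not the author's PASCAL code: the names `parse_datafile`, `goodness` and `PAYOFF` are hypothetical, the three-feature sample file is invented for the demonstration (only the identifier 517 and the values 68, 165, 1 come from the text), and the reading of the format (W header characters per case, then F integers and a category number) is an assumption based on the prose.

```python
# Payoff table for the hospital data: PAYOFF[decision][actual_class].
# Decisions are tri-state: +1 = yes, 0 = don't know, -1 = no.
PAYOFF = {
    -1: {1: -1, 2: +1},
     0: {1:  0, 2:  0},
    +1: {1: +1, 2: -1},
}


def parse_datafile(text):
    """Return (W, F, cases); each case is (description, features, category)."""
    header, _, body = text.partition("\n")
    width, n_features = (int(tok) for tok in header.split())
    body = body.rstrip()
    cases, pos = [], 0
    while pos < len(body):
        if body[pos] in "\r\n":          # skip the line break between cases
            pos += 1
            continue
        description = body[pos:pos + width].strip()
        pos += width
        numbers = []
        # F feature values plus the category number, separated by blanks
        # or line breaks.
        while len(numbers) < n_features + 1:
            while pos < len(body) and body[pos].isspace():
                pos += 1
            start = pos
            while pos < len(body) and not body[pos].isspace():
                pos += 1
            numbers.append(int(body[start:pos]))
        cases.append((description, numbers[:-1], numbers[-1]))
    return width, n_features, cases


def goodness(decisions, classes, payoff=PAYOFF):
    """Accumulated payoff of one rule's decisions over the known classes."""
    return sum(payoff[d][c] for d, c in zip(decisions, classes))


# Invented sample: W = 4, F = 3; the second case wraps onto a new line.
SAMPLE = "4 3\n 517 68 165 1 1\n 518 70 170\n0 2\n"

if __name__ == "__main__":
    print(parse_datafile(SAMPLE))
    # A rule that always answers "yes" on 70 survivors and 43 deaths:
    print(goodness([1] * 113, [1] * 70 + [2] * 43))
```

With this table a rule that always answers "yes" earns 70 - 43 = 27 on the hospital data, which is a useful baseline when reading the scores quoted later in the paper.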

Thus a rule gained a point for a correct classification and lost one for an incorrect one. More complex reward/punishment schedules with more classes are of course possible.

Finally the user supplies an initial rule file containing up to 64 rules. Initially there may be none, in which case the program will generate some at random.

A rule is represented by a fully bracketed Boolean expression ended by a dollar sign, such as

((#4 GE 10) OL ((#4 LT 10) AN (#17 NE 0)))$

which states that variable 4 (#4) should exceed or equal 10 or that both variable 4 should be less than 10 and variable 17 not zero for the rule to give a positive (true) result.

The operators are as follows.

EQ     arithmetic equality
NE     arithmetic inequality
GT     greater than
LT     less than
GE     greater than or equal to
LE     less than or equal to
OL     logical disjunction (inclusive or)
AN     logical conjunction (and)
NO     negation
PLUS   addition
LESS   subtraction
BY     multiplication
OVER   division

(The odd names such as AN and OL were chosen to avoid a clash with PASCAL predefined operators.)

Arithmetic is integrated with logical evaluation because the three truth values are +1, 0 and -1. If a rule yields a final value outside the logic range it will be truncated to the nearest extreme. Arithmetic subexpressions are not truncated (unless they would cause overflow).

2.2 The LEAF Program

LEAF is far simpler. It takes a datafile in the same format as the training set (the only difference being that the actual classes need not be known, zero indicating unknown class membership) and runs a rule file on it. The user specifies how many rules to use: these are always left ordered by HERB with the best first. LEAF can be requested to produce: (1) a listing of all cases with predicted class, and actual class and score if known; (2) a summary of the performance of each rule and all the rules jointly; (3) an ordering of cases by rule consensus from most likely Yes to most likely No.

Notice that the rules produced by HERB can be applied by a person. LEAF is merely a convenience. Contrast this with the linear functions with coefficients expressed to 8 or 10 decimal places output by conventional discriminant analysis packages: no one in their right mind would try to use those without machine assistance.

3 HOW HERB WORKS

HERB attempts to mimic evolution by natural selection. Its "organisms" are the rules and their survival depends on how well they categorize the samples in the training set.

It runs for a number of generations, chosen by the user. A generation consists of one run through the data during which each rule is evaluated on every case and scored according to the payoff matrix. The rules are then ranked by total score with the best rules at the top, i.e. those with the highest score.

The scoring formula is actually

((GOODNESS - MINSCORE) x GFACTOR) / (MAXSCORE - MINSCORE) - SIZE

where MINSCORE and MAXSCORE are the lowest and highest scores possible, GOODNESS is the accumulated payoff and SIZE is the size of the rule measured by counting nodes (terms or subexpressions). What this means is that a long-winded rule scoring the same as a more concise one will be ranked lower. Remember we are treating the rules as organisms: the larger animals need more "food". GFACTOR can be set by the user to alter the balance between goodness and size. A high GFACTOR asks for a good rule at
almost any price; a low setting is a bias towards brevity.

Having been ranked thus, the breeding begins. The top quarter (25%) are left alone. They are good enough to survive untouched. The second quarter are all subjected to a procedure GROW which adds a node composed at random. For example, GROW on

((#1 OL (#2 EQ 0)) GT 62)

might produce

(((#1 PLUS 5) OL (#2 EQ 0)) GT 62)

Rules in the third quarter are subjected to a procedure named SLIM which is the obverse of GROW: they lose a randomly selected term or subexpression. They have survived but are suffering from "malnutrition". Finally the bottom 25% are subjected to a procedure called KILL which, squeamish readers may be assured, causes no pain.

To replace the dead rules new ones are formed by mating together elements from the top half of the list. Internally the rules are held as binary trees. The MATE procedure takes a random subtree from one parent rule selected at random from the upper half and combines it with another chosen likewise. The two parts are then linked by a randomly selected connective to give a fully formed expression. For example, the mating of

((#4 GT 62) AN (#3 EQ 0))

with

((#14 BY -2) PLUS ((#15 GT 5) OL (#2 LE ...

might result in

((#4 GT 62) LESS #8).

The next step is to apply the MUTATION procedure to a few (randomly selected) of the lower 7/8ths of the rule list. This can do various things like altering terms, swapping subtrees, altering operators and so forth. (The top 1/8th is inviolate: rules that high can only be changed if a better 'strain' displaces them.)

Finally, procedure TIDY is applied to all rules. This cuts down redundancies such as double negatives, expressions with a constant value and so on, leaving the pruned tree with the same value but expressed more succinctly. The result of TIDYing

(((5 BY 4) GT 16) AN (#[?] EQ #8))

would be

(#[?] EQ #8)

since (5 BY 4) = 20 and (20 GT 16) = +1 (true).

Then the next generation begins. The process continues for the required number of generations, and then the new rules are printed onto the output file.

4 SOME TESTS OF HERB

The question is: does it work?

To establish a comparative standard the discriminant analysis function of the SPSS package on the DEC System-10 library was run with the hospital admission data. It produced two linear functions of seven variables plus a constant. Both these functions are to be evaluated for each case and if function 1 gives a higher value the sample is assigned to group 1 (living) whereas if function 2 gives a higher value the sample is assigned to group 2 (dead). (There were 70 survivors and 43 deaths, but this information was not used to weight the prior probabilities.)

The diagnostic variables chosen were, in descending order of importance, numbers 6 (mean arterial pressure), 9 (mean venous pressure), 4 (shock type), 14 (urinary output), 10 (body surface area), 15 (plasma volume index) and 16 (red cell index). All were positively loaded on function 1 except 9 (venous pressure). The CPU time to generate these results was 2.85 seconds.

When re-run on the training set data the discriminant functions correctly classified 75% of the cases. The mistakes were: 16 of group 1 classed as group 2, 12 of group 2 classed as group 1.

The HERB program was then run on the same data, starting completely from scratch, i.e. with no pre-determined rules. For all the tests the number of rules was fixed at 48. After 111 generations a run of LEAF indicated that the top rule was correctly grouping 73% of the cases in the training set. This took about 2 minutes of run-time.

After 500 generations the top rule was correct on 81% of the cases (counting a 0, or don't-know, as incorrect as well as any outright misclassifications).

The top rule at this stage was

(#6 GE (61 LESS #14))

where #6 is mean arterial pressure and #14 is urinary output. What it says is that if mean arterial pressure (mm Hg) is greater than or equal to urinary output (ml/hr) subtracted from 61 the patient should survive, otherwise the patient is likely to die. Its mistakes were: 2 survivors classed as group 2; 20 deaths classed as group 1. (The payoff matrix could have been adjusted if these different kinds of error were not equally costly, as no doubt would be the case in practice.)

It is notable that already we have a rule that is better than the linear discriminant functions; and so much simpler that a hospital orderly could easily apply it. (Is this a danger?)

Perhaps statisticians, who are on the whole quite content to computerize techniques worked out by Pearson and Fisher over 50 years ago and who tend to regard even Bayesian decision-making as an exciting but not very respectable novelty, should wake up to the potential of today's computer systems.

A second test was run on data concerned with the physique of male athletes. Here the data was the age (#1), height in inches (#2), weight in pounds (#3) and race (#4) of the medallists in the running and walking events of the 1968 Mexico Olympic Games. Race was either 0 (white) or 1 (black). (One Japanese was arbitrarily assigned to race 0 and Mohammed Gammoudi, who appeared twice by virtue of winning medals in two different events, was classed as 0 the first time and 1 the next: he is Tunisian.)

The aim was to arrive at a rule that would distinguish the sprinters from the long-distance men on the basis of the data about age, height, weight and race. The events were actually put into 5 classes from shortest to longest.

Class   Events
1       100 m, 200 m, 110 m hurdles
2       400 m, 400 m hurdles
3       800 m, 1500 m, 3000 m steeplechase
4       5000 m, 10000 m, 20 km walk
5       Marathon, 50 km walk

The various payoffs were assigned accordingly.

                    Actual Class
Rule Decision     1    2    3    4    5
    -1           +2   +1    0   -1   -2
     0            0    0    0    0    0
    +1           -2   -1    0   +1   +2

A decision of +1 is interpreted as long-distance competitor, -1 as sprinter.

After 666 generations the top rule was

((155 LESS #3) PLUS (-5 BY #4))

which was only making one mistake on the 32 samples in the training set. What it says, in brief, is that if you are white and weigh over 155 pounds you are a sprinter, if you weigh less you are a distance runner; if you are black and weigh over 150 pounds you are a sprinter, otherwise you are a long-distance runner.

As a test 12 gold medallists from the 1980 Moscow Olympics were rated by this rule. This was fresh data, not used in the training phase. All were correctly categorized except Pietro Mennea who, at 150 pounds, is a bit light for a white sprinter.

N.B. These figures apply to Olympic athletes: just because you weigh over 155 pounds do not get the idea that you are a match for Allan Wells!

5 REMARKS

I see three justifications for this kind of exercise. Firstly, it is interesting in its own right; secondly, the rules behave in an interesting fashion; and thirdly, it seems to work.

In the first place it is fun to try a little abstract gardening, growing an orchard of binary trees. And it might be fruitful in another sense. After all, we are only here by courtesy of the principle of natural selection, AI workers included, and since it is so powerful in producing natural intelligence
it behoves us to consider it as a method for cultivating the artificial variety.

The second justification is the surprisingly life-like behaviour of the rules themselves. It can be appreciated by looking at the top 4 rules produced by HERB on the hospital admissions data after 1, 11, 111 and 1111 generations.

1 generation                                                  Age   Score
(#1 PLUS 0)                                                     1      27
(#2 PLUS 0)                                                     1      27
(#1 PLUS 0)                                                     1      27
(#1 PLUS 0)                                                     1      27
[27 = chance expectation]

11 generations
(NO ((#16 LE -1) BY (#6 GT 53)))                              [?]    [?]
(#6 GE #17)                                                   [?]    [?]
(((#16 LE -1) BY (#6 GT 53)) [?] ((#1 EQ #11) NE -10000))     [?]    [?]
(#1 PLUS 0)                                                    10      27

111 generations
(137 OVER (#6 GT 53))                                          71      51
(135 OVER (#6 GT 53))                                          55      51
(1 OVER (#6 GT 53))                                           [?]      51
(137 OVER (#6 GT 53))                                         [?]      51

1111 generations
(#6 GE (61 LESS #14))                                         691      69
(#6 GE (62 LESS #14))                                         502      69
((61 LESS #14) LE #6)                                         119      69
((61 LESS #14) LT #6)                                         418      69

What we see here is the appearance (and subsequent disappearance) of dominant "species". Each type flourishes for some time until quite suddenly supplanted by a new and superior line, typically a mutation of one of its own offspring. When this happens the extinction of the more primitive forms is rapid and complete.

It seems that once a rule fastens on a particular indicator variable or combination of variables it will give rise to several copies or near-copies forming a family which thrive until a better rule appears, possibly using an entirely different set of indicators. It is as if the new variety have found a more nourishing "diet".

There is nothing to prevent the user inserting a man-made rule at any stage; indeed it is salutary to do so since all trace of it is normally lost within a few generations. (If it were easy to cast your eye over a large mass of figures and extract an efficient classification rule for the cases there would be little need for this kind of program.)

The third point is that the system works quite well, even though this is version 1.0 of the program. The rules produced are short and to the point, though it is fair to mention that the computing cost is quite heavy: almost 2 minutes of run time per 100 generations on the DEC System-10 (KL10 processor) on the hospital data.

As BEAGLE is quite successful on toy databases the reader with gambling instincts may care to participate in a little field testing on far more messy data. For what it is worth, here is the top rule produced by HERB after 400 generations on a datafile containing 1000 English and Scottish League and Cup football results (1979-80).

(NO ((#58 OVER #45) AN #71))

Its value is meant to be true (positive) for drawn games, negative otherwise. The variables are: #58 the away team's ground capacity (in thousands of spectators) subtracted from the home team's crowd capacity (thousands); #45 goals scored by the away side in their last away game;
#71 difference formed by adding home team's home goals scored in their last 8 home matches to away team's goals conceded in their last 8 away games, then subtracting home team's goals against plus away team's goals for in the same matches.

6 FUTURE DIRECTIONS

BEAGLE is still at a prototype stage, and can be considerably improved. One planned enhancement was mentioned in section 2.1: allowing the multi-state logic range to be specified by the user.

A second extension that would not be difficult to implement would be to allow floating-point arithmetic as well as integers, though the interaction with logical values would have to be carefully considered first. It might be an opportunity [8] to introduce 'fuzzy logic' at the same time. (HERB and LEAF already use a "slightly fuzzy" comparison scheme such that 64 GE 65 is not quite so false as [?] GE 65, but the usefulness of this has not been assessed.)

Another planned improvement is the inclusion of additional operators. The MOD (remainder) function will be one, but more important will be the pair SO...OS to allow constructions of the form

(B SO (A1 OS A2))

which stands for

if B then A1 else A2

and will give a rudimentary programming ability. Of course this highlights the fact that the rules are really programs in a special-purpose language, which might lead to the conclusion that the system should ultimately generate LISP functions. But this is rather a distant goal. It would remove all restrictions, but whether HERB or anything like it could cope with the extra power remains to be seen. Probably a compromise, such as generalizing the rules as far as production systems of a limited complexity, would be more manageable.

In effect the rules as they stand simulate single-celled organisms, without specialization of function. To move on to a hierarchic structure, corresponding to multi-cellular animals, HERB would need 2nd-level organisms (strategies) with "green fingers" whose welfare depended on successfully managing the first-level ones.

A more serious need is to make greater use of information provided by bad rules, rather than just discarding them and quite possibly regenerating them later. This is a major weakness and all the obvious remedies (e.g. a rote memory of bad rules) would introduce considerable overheads.

7 CONCLUSIONS

I would like to present Naturalistic Selection as a viable AI technique. This is not to say it is a panacea. I suspect that there is always a better way (cheaper and/or quicker). But no single method is more appropriate for "satisficing" [9] in such a wide variety of problems.

On the credit side Naturalistic Selection is absolutely general. A user can hurl any data at HERB, however nonlinear, however "noisy", however much it violates the assumptions about distribution and scaling underlying most statistical tests, and get a reasonable set of discrimination rules in reasonable time. And since those rules are public and comprehensible, not arcane technocratic black magic, man-machine cooperation is facilitated. The human can do some of the hypothesizing (which people are good at) leaving the testing (which people are bad at) to the computer.

For example, to classify the test data used in this paper a sequential decision procedure such as that proposed by Hunt [10] might have been more economical; but the trouble with stepwise algorithms which yield a discrimination net or progressive filter network is their susceptibility to noise in the data. They work best in situations where there can be no error in the training data and where the variables have rather few discrete values (e.g. chess endgames). HERB, though rarely if ever optimal, will almost always come up with something usable.

Lastly there is the matter of image. Is the designer of expert systems to be seen as a soulless white-coated machine-minder or as someone who, for the first time since the expulsion from Eden, is not merely picking new fruit from the forbidden (Binary) Tree of Knowledge, but actually making it grow?
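
For readers who want to experiment, the rule evaluation and ranking described in sections 2.1 and 3 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the HERB source: rules are held as nested tuples; comparisons are crisp (+1 or -1) rather than "slightly fuzzy"; AN and OL are taken as minimum and maximum of the truncated operands; OVER truncates toward zero and yields 0 on division by zero; and the scoring formula is read as ((GOODNESS - MINSCORE) x GFACTOR) / (MAXSCORE - MINSCORE) - SIZE. The names `decide`, `size`, `score` and `treatment` are all hypothetical.

```python
# A rule is an int constant, ("VAR", n) for variable #n, ("NO", x),
# or (OP, left, right) for one of the binary operators below.

def clamp(value):
    """Truncate to the tri-state logic range -1..+1."""
    return max(-1, min(1, value))

def truth(flag):
    return 1 if flag else -1

def trunc_div(a, b):
    if b == 0:
        return 0                      # assumption: not specified in the paper
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q

BINARY = {
    "EQ": lambda a, b: truth(a == b),
    "NE": lambda a, b: truth(a != b),
    "GT": lambda a, b: truth(a > b),
    "LT": lambda a, b: truth(a < b),
    "GE": lambda a, b: truth(a >= b),
    "LE": lambda a, b: truth(a <= b),
    "OL": lambda a, b: max(clamp(a), clamp(b)),   # assumed semantics
    "AN": lambda a, b: min(clamp(a), clamp(b)),   # assumed semantics
    "PLUS": lambda a, b: a + b,
    "LESS": lambda a, b: a - b,
    "BY": lambda a, b: a * b,
    "OVER": trunc_div,
}

def raw_value(rule, case):
    """Evaluate without truncating arithmetic subexpressions."""
    if isinstance(rule, int):
        return rule
    if rule[0] == "VAR":
        return case[rule[1]]
    if rule[0] == "NO":
        return -clamp(raw_value(rule[1], case))
    return BINARY[rule[0]](raw_value(rule[1], case), raw_value(rule[2], case))

def decide(rule, case):
    """Final tri-state decision: only the top-level value is truncated."""
    return clamp(raw_value(rule, case))

def size(rule):
    """Count nodes (terms and subexpressions)."""
    if isinstance(rule, int) or rule[0] == "VAR":
        return 1
    return 1 + sum(size(child) for child in rule[1:])

def score(goodness, rule_size, minscore, maxscore, gfactor):
    return (goodness - minscore) * gfactor / (maxscore - minscore) - rule_size

def treatment(rank, n_rules):
    """Fate of the rule at 0-based rank after sorting (needs n_rules >= 4)."""
    quarter = n_rules // 4
    return ("KEEP", "GROW", "SLIM", "KILL")[min(rank // quarter, 3)]

# The two rules quoted in section 4, written as trees.
HOSPITAL = ("GE", ("VAR", 6), ("LESS", 61, ("VAR", 14)))
ATHLETE = ("PLUS", ("LESS", 155, ("VAR", 3)), ("BY", -5, ("VAR", 4)))

if __name__ == "__main__":
    print(decide(HOSPITAL, {6: 50, 14: 20}))   # pressure 50, output 20
    print(decide(ATHLETE, {3: 150, 4: 0}))     # 150 lb, race 0
```

A case is a mapping from variable number to value, so `decide(ATHLETE, {3: 150, 4: 0})` gives +1 (long-distance), which is how the rule comes to misplace a 150-pound white sprinter. MATE, MUTATION and TIDY are omitted; they would operate on the same tuple trees.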

[PASCAL source listings of HERB and LEAF are available on request from the author at: Maths Dept., Polytechnic of North London, N7 8DB, U.K.]

REFERENCES

1. B. G. Buchanan and T. M. Mitchell, "Model-Directed Learning of Production Rules" in Waterman & Hayes-Roth (eds), Pattern-Directed Inference Systems (Academic Press, New York, 1978).
2. G. Pask, An Approach to Cybernetics (Hutchinson, London, 1961).
3. A. Bernstein and I. Rubin, "Artificial Evolution of Problem-Solvers", American Behavioural Scientist (May 1965).
4. L. J. Fogel et al., Artificial Intelligence through Simulated Evolution (Wiley, New York, 1966).
5. O. G. Selfridge, "Pandemonium: a Paradigm for Learning", NPL Symposium on Mechanization of Thought Processes (HMSO, London, 1959).
6. A. G. Bell, Games Playing with Computers (Allen & Unwin, London, 1972).
7. A. A. Afifi and S. P. Azen, Statistical Analysis: A Computer Oriented Approach (Academic Press, New York, 1972).
8. L. A. Zadeh, "Fuzzy Sets", Information and Control 8 (1965).
9. H. A. Simon, The Sciences of the Artificial (MIT Press, Mass., 1966).
10. E. B. Hunt et al., Experiments in Induction (Academic Press, New York, 1966).
