Learning

In this paper, learning refers to acquiring truly new behaviour, as opposed to just modifying the parameters of already existing behaviours. Learning is an attractive prospect, but there are serious concerns that need to be addressed. However, it is possible to create a successful game that features real learning, as for example Black & White [3] demonstrates.

Gamers often prefer online gaming against other humans over playing against AI opponents. However, sports games involve entire teams of characters, out of which the gamer controls only one character at any given time. Good AI is then still important, since it must be able not only to work against the opponent, but also to work with the human. In addition, contrary to many other games, the characters in a sports game tend not to die during the game. This gives them ample time to display their level of intelligence.

One of the concerns of releasing a game that learns after it ships is that the AI is much more difficult to test. It is worth noting, though, that the developers of Black & White feel that the testing problem is actually quite manageable [2]. Another concern is that the AI must not learn the wrong lessons from gamers who are, possibly deliberately, being incompetent. Finally, the learning method must of course be able to run in real time within the hardware constraints of game platforms.

Commercial games are different from RoboCup soccer and most other games that academics have traditionally studied. The goal in a commercial game is not to win, but to entertain. When learning from a loss, an adaptive AI should not ensure that it never loses again, just that it never again loses in the same way.

Sports games

Sports game AI faces one specific challenge that is absent from many other game genres. The game is to simulate events that actually happen in the real world, and most gamers will be quite familiar with what those events look like. Every feature of sports games carefully tries to re-create the real-world example as faithfully as possible; the AI should be no exception. This goes not only for the behaviour of the individual players, but also for the team as a whole.

One unique feature of sports games is that ideally the characters and teams should behave like their specific real-world counterparts. In a basketball game, the Shaquille O'Neal character should never go for a lay-up instead of a dunk, and Dennis Rodman should never take a 3-point shot. In a soccer game, playing against Brazil should feel different from playing against Italy. This makes the strategy sandbox even smaller; not only should the adapting Brazilian strategy stay sensible, it should stay Brazilian. Behavioural models make this possible.

Behavioural models

The Finite State Machine (FSM) is a commonly used paradigm for game AI, appearing in classical and fuzzy varieties. One feature of the FSM is that a character is by definition always in just one particular state. Suppose the creature in Figure 1 can be in states like "afraid: flee from enemy" and "hungry: search for food". If the creature chooses to flee, it will also move away from all the food. If it chooses to find food, the hunger state might tell it to head for the nearest food source, but that will take it dangerously close to the enemy. While hungry, it forgets all about its other goals in life.
In a behavioural model [1], such drives are all active simultaneously, and they have varying levels of influence. In the model known as "schema-based coordination", each behaviour generates a force field, pushing the creature in a certain direction. The force driving the creature is the sum of all these influences. In Figure 1 there would be attractive forces from both food items and a repelling force from the enemy. The result is that the creature can satisfy both drives. Models like these are used for example in The Sims, with their attractiveness landscape, and in RoboCup soccer [8].

If a game AI engine attempts to calculate "whether or not" predictions, such as whether or not a character can reach a particular goal or whether or not it should execute a certain task, the resulting behaviour can be rigid and predictable. The problem is analogous: since the decision is always this or that, the character is confined to a patchwork of regions of identical behaviour. Behavioural approaches can produce more dynamic and fluid behaviour.

In sports games like soccer, hockey, and basketball, the drives that govern an agent can be things like "intercept ball", "stay onside", "attack goal", etc. The resulting behaviour satisfies all these goals as much as possible. Each drive has a certain direction and strength. The strength can depend on how important the drive is at a given instant. For example, "intercept ball" is not important when a teammate has the ball.

An additional advantage of behavioural models is maintainability. When a new state is added to an FSM, the programmer needs to figure out all the new state transitions, as well as remember which already existing goals need to be satisfied at the same time in the new state. By contrast, adding a new behaviour in a behavioural model does not invalidate or duplicate the already existing behaviours, since they are all active simultaneously. Each behaviour just needs to know how important it is at any given time. These urgency levels present an important opportunity: they can be subject to learning.

A learning example

When something undesirable happens, such as a goal scored by the opponent or maybe even just a shot on goal, it may be possible to go back and adjust the urgency levels of the various behaviours in order to stop the same thing from happening next time. One could even add a new behaviour for each mistake, taking care of avoiding the same mistake in the future.

Each learning sample contains information about the situation in which it occurred, and information about what the agents should have done about it. The latter piece of information is a drive pushing the agents in a direction that hopefully avoids the mistake. The urgency of this drive depends on how similar the current situation is to the learning sample. One can think of ways to figure out how similar two situations are, and ways to do this quickly such that the whole procedure is computationally feasible under real-time constraints. But how to figure out the drives: what should the agents have done to avoid repeating a certain mistake?

Figure 2 shows a learning example in FIFA 2002. It encodes a situation that occurred in the past, where the Blue team ended up scoring. The trace shows the trajectory of the ball during that play. When the trace is solid blue, one of the players of team Blue, indicated by the jersey number, has the ball. When the trace is dotted, the ball is underway from one player to another. White dots indicate that the ball is close enough to the ground to be intercepted by a player; if the ball is too high in the air, the dots are red. The sequence starts with a goal kick by the Blue goalkeeper, and ends with player 6 scoring a goal.
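The way a learned drive could sit alongside the existing ones can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the representation of a drive as a (direction, urgency) pair, and the particular similarity measure (urgency falling off with the squared distance between the current ball position and the one stored in the learning sample) are hypothetical choices, not taken from the game.

```python
import math

def combine_drives(drives):
    """Sum all drives; each drive is ((dx, dy), urgency).

    Directions are normalised so that only the urgency sets the strength.
    """
    fx = fy = 0.0
    for (dx, dy), urgency in drives:
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue  # a drive without a direction contributes nothing
        fx += urgency * dx / norm
        fy += urgency * dy / norm
    return fx, fy

def learned_urgency(ball, sample_ball, scale=1.0):
    """Hypothetical similarity measure: urgency is highest when the ball is
    where it was in the learning sample and decays with squared distance."""
    d2 = (ball[0] - sample_ball[0]) ** 2 + (ball[1] - sample_ball[1]) ** 2
    return scale / (1.0 + d2)

# Example: an existing drive plus a learned drive whose urgency depends on
# how close the current situation is to the stored sample.
drives = [
    ((1.0, 0.0), 2.0),                                       # e.g. "attack goal"
    ((0.0, 3.0), learned_urgency((0.0, 0.0), (0.0, 0.0))),   # learned drive
]
net_force = combine_drives(drives)
```

Because all drives are summed, adding the learned drive leaves the existing ones untouched, which is the maintainability argument made for behavioural models above.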
If the AI has not learned anything from that sequence, then the same events can happen again. Figures 3 and 4 show two snapshots of the play sequence as it unfolded. In both cases, one of the Blue players is about to send off a pass.
Figure 3 shows player 3 about to pass the ball to the location indicated by the dotted circle. Player 11 will run to that location to receive the pass. He can do this because he is closer to the dotted circle than any of the opposing players. Figure 4 shows player 11 about to cross the ball in front of the goal. The cross is targeted at player 6, who connects and scores the goal. Again, the player was able to do this because he was closer to the reception point than any of the opponents.

What might the other team have learned from this? The mistake was mostly due to defender 5, who allowed attacker 6 to slip past him. In the first snapshot, attacker 6 was still far behind him. Defender 5 could have prevented this by moving closer to the key spot in Figure 4. At an earlier point in the play, defender 2 or 6 could have interfered with attacker 11's activities by moving closer to the key spot in Figure 3.

The defending team does not need to prescribe who goes to those spots, just that someone does. By the same token, it does not matter which one of the attacking players has the ball, just that he has the ball in a location that resembles the one in the learning example. It is the proximity of the ball to one of the key points that matters. Thus the learning example does not give strategic hints to specific players, but rather to specific areas of the field. The same learning example can become active in situations that are not identical but similar, when the ball is near the trajectory of the learning example. It can also be used by the attacking team, in order to try and repeat a successful play.

Force fields

The various drives of the players act as force fields. One of the drives is the one that represents the adaptive strategy. Pseudocode for this force field is given in the appendix.

The behaviour learned from the example in the previous section is a force field that is anchored to the playing field, not to any player in particular. This follows the adage "the intelligence is in the environment, not in the ant", as in The Sims, where the instructions on how to use an object are contained in the object, not in the character that uses it.

A force field specifies its influence on the characters, as a function of their location on the soccer field. In addition, it also needs to specify the context in which it applies. A particular learning example is relevant only if the current situation is similar to the one that happened in the learning example. Thus the force depends on two parameters: location and context.

When learning is triggered, the algorithm first needs to choose when the relevant play sequence started. In this case the start of the sequence is defined as the latest time when the scoring team gained possession of the ball, or the latest re-start of play, whichever happened later.

Next, the force field is calculated by making all the points on the ball trajectory attractors. This excludes the points where the ball was high in the air, indicated by red dots, since the ball could not be intercepted there. The attractor forces can be calculated as a gravity field, diminishing with the square of the distance. This ensures that not all players head towards the same spots, and also deals with the problem of a single player caught between having to defend two spots: instead of staying in the middle and defending neither spot, other forces will make the player drift towards one of the two spots, and then the stronger attraction will force a commitment to that one.

For the learning example of Figure 2, the resulting force field is shown in Figure 5. The field is indicated only on the points of a grid that covers the field. Since the field is well-behaved, it is sufficient to store the field values only on those grid points and use interpolation elsewhere. The grid resolution can be chosen to meet any storage capacity limits.

Figure 5 shows the ball trajectory in white, and the salient points on the trajectory are indicated as white dots. Salient points are those where the ball changes possession; the ball can be possessed by a player or by "ground" or "air". Thus the points where the ball changes from close to the ground to high in the air are also salient points. In this example, the attractive forces are calculated not for all the points on the ball trajectory, but just for the salient points. This may be a sufficiently effective summary of the trajectory. Using the full trajectory requires more computation, but it only needs to be done once.
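The notion of salient points can be illustrated with a short Python sketch. The data layout is an assumption made for illustration only: the trajectory is taken to be a list of (position, owner) samples in play order, where owner is either a player identifier or one of the strings "ground" and "air". The function name and the choice to report the first sample under each new owner are likewise hypothetical.

```python
def salient_points(samples):
    """Return the positions at which the ball changes possession.

    samples: list of (position, owner) pairs in play order. The position of
    the first sample under each new owner is reported as the salient point.
    """
    return [
        position
        for (_, previous_owner), (position, owner) in zip(samples, samples[1:])
        if owner != previous_owner
    ]

# Example: a pass that starts on the ground, rises into the air, comes
# down again and is received by another player.
play = [
    ((0.0, 0.0), "B3"),
    ((1.0, 0.0), "B3"),
    ((2.0, 0.0), "ground"),
    ((3.0, 0.0), "air"),
    ((4.0, 0.0), "air"),
    ((5.0, 0.0), "ground"),
    ((6.0, 0.0), "B11"),
]
points = salient_points(play)
```

Summarising the trajectory this way reduces the number of attractors from every sample to a handful of points, which is the trade-off described above.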
The second parameter of the force field, its context, specifies how much influence it carries in any given situation. In this example, the field is relevant to the Yellow team when Yellow does not have possession of the ball and the ball is near the trajectory of the learning example. Thus the strength of the force field depends on the position of the ball. Figure 6 shows the force field resulting from two learning examples, for a situation where the ball is closer to the original trajectory than to the new one. In this situation, the former force is stronger than the latter.
As Figure 6 suggests, it fortunately is not necessary to maintain one force field for each learning example, since force fields are additive. For each position of the ball, the strengths of all the force fields can be calculated and the fields can be added together. This results in one net force field for each position of the ball. The collection of positions of the ball can, in turn, also be sampled in a grid and interpolated. At runtime, the position of the ball indexes into one of the force fields. In order to find the resulting force during game play, all that is needed are four direct look-ups and an interpolation.

Temporal discounting can be introduced by making the earlier points more attractive than the later points, to encourage the players to disrupt the play as soon as possible. This increases the computational cost, but again that does not matter since it is an offline computation that only needs to be done once. Figure 7 shows such a field.
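One simple way to realise such temporal discounting is a geometric weight per attractor. The following Python sketch assumes that form; the weight discount**i for the i-th attractor in play order, the default value 0.8, and the function names are illustrative assumptions rather than the scheme used for Figure 7. Each attractor pulls with a strength that diminishes with the square of the distance, as described for the gravity field above.

```python
import math

def discounted_weights(n_points, discount=0.8):
    """Weight of the i-th attractor in play order. With 0 < discount < 1,
    earlier points weigh more than later ones."""
    return [discount ** i for i in range(n_points)]

def discounted_force(p, attractors, discount=0.8):
    """Net force at position p from attractors listed in play order."""
    fx = fy = 0.0
    weights = discounted_weights(len(attractors), discount)
    for w, (tx, ty) in zip(weights, attractors):
        dx, dy = tx - p[0], ty - p[1]
        r = math.hypot(dx, dy)
        if r == 0.0:
            continue  # assumption: no force exactly at an attractor
        # unit direction (dx/r, dy/r) scaled by the inverse-square strength
        fx += w * dx / r ** 3
        fy += w * dy / r ** 3
    return fx, fy

# A player midway between an early and a late attractor is pulled
# towards the earlier one.
force = discounted_force((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0)], discount=0.5)
```

A discount above 1 would instead favour later points, so the same parameter can express either preference.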
Later in the same sequence it is no longer necessary to control the points that have already been passed. Only the remaining points on the trajectory are interesting. This can be addressed by adding the partial trajectory from each salient point to the end point as a learning example, which is equivalent to temporal discounting where the later points in the sequence are more attractive. This discounting factor is a parameter that can be played with.

Before and after

After the forces resulting from the learning example have been calculated, the same situation can be re-started to see if the Yellow team has learned anything. Figure 8 shows the result. The play starts approximately the same. It does start deviating a bit when the ball gets to player Blue-3, but the pass to the outside left wing is still fired off. However, defender Yellow-6 has now altered his behaviour sufficiently to get there in time and even manage to intercept the ball.

Figures 9 and 10 show a closer look at Yellow-6's trajectory. In the first case Yellow-6 wastes too much time before heading over to Blue's pass reception. The second case starts the same, but then sees Yellow-6 turning around sooner and getting back in time.
Discussion
The main focus of this learning experiment is to augment, not replace, any existing strategy. This may be compared to the subsumption architecture as used in robotics [4], and corresponds to the behavioural approach of allowing all behaviours to be active, instead of only one. An additional benefit of this is that the learning component is easy to add to an existing commercial games program.

The learning model is deliberately kept simple in order to be cheaply calculated at runtime. It involves a one-time calculation which can be done offline, for instance during goal celebration animations. At runtime, a table lookup and an interpolation suffice. The memory requirements can be adjusted as needed, by modifying the resolution of the grid on which the force fields are sampled.

No attempt is made to discover theoretically optimal solutions, nor to model or predict game situations as they occur. The goal is to let game characters behave in a human-like way, as well as to make sure that they do not become unbeatable; for both purposes, optimality is undesirable. More important than making sure that the AI does not lose is making sure that it does not lose in the same way repeatedly.

Another requirement that is kept in mind is that the model should be able to learn from very few examples, since the examples are to be provided by a human gamer playing the game in real time. The examples shown in this paper all involve changes that occurred as the result of a single training example. In general, very few examples are needed to adjust the playing style sufficiently to disrupt previous mistakes, while not disturbing the overall strategy.

There are several parameters and options to play with in the force field model. The decay rate in space, time, and relevance can all be adjusted. The forces can be modeled as gravity fields, with the strength of the force proportional to 1/r^2 where r is the distance to the attractor point, but other fields are also possible. For instance, the field could be proportional to 1/r^2 when r > R and r/R^3 when r < R, corresponding to the gravity field of an object of radius R. The intuition behind this type of field is to diminish the attractive force when the location is already under control of the player; near the centre of attraction, the force becomes zero.

The experiments in this paper were performed using Electronic Arts' FIFA 2002 software, but they could also be applied to other sports games, such as hockey and basketball, as well as other game genres, such as Real Time Strategy games.

Acknowledgments

I am indebted to Electronic Arts Canada, and in particular to John Buchanan, Jason Rupert, and Matt Brown, for their cooperation, and to Jonathan Schaeffer for providing feedback on early drafts of this paper.

References

1. Arkin, Ronald C. Behavior-Based Robotics. MIT Press, 1998.
2. Barnes, Jonty, and Jason Hutchens. Testing Undefined Behavior as a Result of Learning. In Steve Rabin, editor, AI Game Programming Wisdom, pages 615-623. Charles River Media, 2002.
3. Black & White. Lionhead Studios / Electronic Arts, 2001. See www.bwgame.com.
4. Brooks, Rodney A. Challenges for Complete Creature Architectures. From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, MIT Press, 1990.
5. FIFA 2002. Electronic Arts, 2001. See www.fifa2002.ea.com.
6. RoboCup soccer tournament. See www.robocup.org.
7. The Sims. Maxis / Electronic Arts, 2000. See www.thesims.com.
8. Stone, Peter, and David McAllester. An Architecture for Action Selection in Robotic Soccer. In Jörg P. Müller, Elisabeth Andre, Sandip Sen, and Claude Frasson, editors, Proceedings of the Fifth International Conference on Autonomous Agents, pages 316-323, Montreal, Canada, 2001. ACM Press.
9. Woodcock, Steven. Game AI: The State of the Industry 2001-2002. Game Developer Magazine, July 2002, pages 26-31.

Pseudocode

Below follows high-level pseudocode for calculating the force field associated with a new training example, updating the existing force fields, and determining the forces at runtime.

Let FieldGrid and BallGrid be discrete sets of points both covering the soccer field at some arbitrary resolution. When a new training example arrives, containing a ball trajectory, a force field NewField(p) is calculated where p is the position on the field.
    foreach p in FieldGrid {
        foreach t in trajectory {
            NewField[p] += (t-p) / |t-p|^2
        }
    }
Note that p and t are vectors, and that |t-p| denotes the length of the vector. The existing forces are encoded in MainField(b,p), which encodes the force at field location p when the ball is in position b. The parameter b specifies the context. The new field is added to the main field with its influence depending on the distance of b to the trajectory.
    foreach b in BallGrid {
        /* determine d = distance of ball to trajectory */
        d = infinity
        foreach t in trajectory {
            d = min(d, |b-t|^2)
        }
        /* add NewField to MainField[b] with strength 1/(1+d) */
        foreach p in FieldGrid {
            MainField[b,p] += NewField[p]/(1+d)
        }
    }
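For readers who want to experiment, the two pseudocode fragments above can be transcribed into runnable Python roughly as follows. This is a sketch under assumptions, not the implementation used in the experiments: points are (x, y) tuples, fields are dictionaries, the function names are hypothetical, the trajectory is assumed to be non-empty, and an attractor is assumed to exert no force at its own location to avoid division by zero. The exponent is exposed as a parameter: power=2 follows the pseudocode literally, which gives a force magnitude of 1/|t-p|, while power=3 gives the inverse-square magnitude described in the main text.

```python
import math

def new_field(field_grid, trajectory, power=2):
    """NewField[p] = sum over t of (t - p) / |t - p|**power."""
    field = {}
    for p in field_grid:
        fx = fy = 0.0
        for t in trajectory:
            dx, dy = t[0] - p[0], t[1] - p[1]
            r = math.hypot(dx, dy)
            if r == 0.0:
                continue  # assumption: no force exactly at an attractor
            fx += dx / r ** power
            fy += dy / r ** power
        field[p] = (fx, fy)
    return field

def add_to_main_field(main_field, field, ball_grid, trajectory):
    """Add `field` to MainField[b, p] with strength 1/(1+d), where d is the
    smallest squared distance from ball position b to the trajectory."""
    for b in ball_grid:
        d = min((b[0] - t[0]) ** 2 + (b[1] - t[1]) ** 2 for t in trajectory)
        strength = 1.0 / (1.0 + d)
        for p, (fx, fy) in field.items():
            ox, oy = main_field.get((b, p), (0.0, 0.0))
            main_field[(b, p)] = (ox + strength * fx, oy + strength * fy)
    return main_field

# Example: one attractor, two field points, two ball positions.
trajectory = [(1.0, 0.0)]
nf = new_field([(0.0, 0.0), (2.0, 0.0)], trajectory)
main = add_to_main_field({}, nf, [(1.0, 0.0), (1.0, 1.0)], trajectory)
```

Because the update only adds to existing entries, calling add_to_main_field once per learning example accumulates all examples into a single table.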
At runtime, the force at field position p when the ball is in position b can be looked up directly by snapping p to the FieldGrid and b to the BallGrid, and looking up MainField[b,p] directly. If the two grids have low resolution, the field can be looked up in the surrounding points and then interpolated.
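The run-time lookup can be sketched in Python as follows. The storage layout is assumed for illustration: both grids are uniform with their origin at (0, 0), grid points are addressed by integer indices, MainField is a dictionary mapping a ball-grid index to a field, and each field maps a field-grid index to a force vector. The ball position is snapped to the nearest ball-grid point, and the force at p is bilinearly interpolated from the four surrounding field-grid points. All names are hypothetical, and the ball is assumed to lie within the ball grid.

```python
import math

def snap(value, cell):
    """Index of the nearest grid point for a coordinate on a uniform grid."""
    return int(math.floor(value / cell + 0.5))

def interpolate(field, x, y, nx, ny, cell):
    """Bilinear interpolation on an nx-by-ny grid (nx, ny >= 2).
    Positions outside the grid are clamped to its border."""
    gx = min(max(x / cell, 0.0), nx - 1.0)
    gy = min(max(y / cell, 0.0), ny - 1.0)
    i = min(int(gx), nx - 2)
    j = min(int(gy), ny - 2)
    u, v = gx - i, gy - j
    corners = (field[(i, j)], field[(i + 1, j)],
               field[(i, j + 1)], field[(i + 1, j + 1)])
    weights = ((1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v)
    fx = sum(w * c[0] for w, c in zip(weights, corners))
    fy = sum(w * c[1] for w, c in zip(weights, corners))
    return fx, fy

def force_at(main_field, ball, p, nx, ny, field_cell, ball_cell):
    """Snap the ball to the ball grid, then interpolate the force at p."""
    b = (snap(ball[0], ball_cell), snap(ball[1], ball_cell))
    return interpolate(main_field[b], p[0], p[1], nx, ny, field_cell)

# Example: a 2-by-2 field grid stored for a single ball-grid point.
field = {(0, 0): (0.0, 0.0), (1, 0): (2.0, 0.0),
         (0, 1): (0.0, 4.0), (1, 1): (2.0, 4.0)}
main_field = {(0, 0): field}
force = force_at(main_field, (0.2, 0.1), (0.5, 0.5), 2, 2, 1.0, 1.0)
```

Interpolating over the ball position as well would follow the same pattern, at the cost of four field lookups per ball-grid corner.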