
The Research on E-business-oriented Automatic Negotiation System Based on Faithful and Dynamic Q-study

Chen Peiyou (1), Li Yijun (2), Li Xing (3)
1. Heilongjiang Institute of Science and Technology, P.R. China
E-mail: chenpeiyou@sohu.com
2. Harbin Institute of Technology, P.R. China
E-mail: liyijun@hit.edu.cn
3. Heilongjiang Institute of Science and Technology, P.R. China
E-mail: wslixing@yahoo.com.cn

Abstract: To address some problems of Q-learning, this article designs an agent-based dynamic Q-learning algorithm for the current negotiation environment, which estimates the Q value according to the currently updated environment. The learning mechanism includes two aspects: one is learning the environment under the current negotiation conditions together with the belief about the opponent's information, using Bayesian learning to update the belief; the other is learning the dynamic negotiation process, using dynamic Q-learning to generate proposals.
Keywords: Automated Negotiation, Belief, Reinforcement Learning, Dynamic Q-Learning

1 INTRODUCTION

In many e-business negotiations, the negotiators on both sides do not reveal their private information, including their preferences, reservation values, negotiation deadlines and bidding strategies. As a result, negotiations always take place under incomplete information. As the number of negotiation issues increases, the negotiation environment becomes more complicated. Because of this unpredictability, it is necessary to develop more capable software agents. Such agents help negotiators learn the private information the other side reveals, choose the best strategy as conditions change, and so negotiate more effectively in the current negotiation environment. It is therefore extremely important for a negotiation agent to be able to learn online.

To make an agent learn, one can create an autonomous learning algorithm that obtains new knowledge from the environment and so improves the agent's intelligence and efficiency. Theoretical analysis shows that if an agent taking part in an interaction can reason about the other side's beliefs and update its own beliefs, and can furthermore learn the other side's behavioral style during the interaction, it eventually profits more from the process. A negotiation agent therefore interacts with the dynamic negotiation environment by trial and error. Online learning is the foundation of automated negotiation behavior.

Learning during negotiation covers four aspects: the first is to learn information about the negotiation environment; the second is to learn the proposals of the negotiation adversary; the third is to learn the adversary's intentions; the fourth is to learn negotiation actions, strategies and experience, so as to reuse strategies from similar successful negotiations and avoid unsuccessful ones. The outcome of a negotiation can be used to evaluate its quality. The learning mechanism in this article is divided into two parts: one is learning the environment of the current negotiation and the belief about the adversary's information, for which Bayesian learning is used to update the belief; the other is learning the dynamic negotiation process, for which dynamic Q-learning is used to generate proposals.

2 MACHINE LEARNING METHODS OF AUTOMATIC NEGOTIATION SYSTEM

Introducing learning mechanisms into automatic negotiation systems is a new topic of recent years. Three machine learning methods are used in automatic negotiation systems.

2.1 Bayesian Learning

In machine learning, we are interested in determining the best hypothesis in a hypothesis space H given training data D. The best hypothesis is the most probable one, given the data D and the prior probabilities of the different hypotheses in H. Bayesian theory provides a direct, feasible method of calculating it.
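As a worked statement of the idea in Section 2.1, the most probable hypothesis given the data is commonly written as the maximum a posteriori (MAP) hypothesis. This is the standard textbook formulation, not a formula taken from this paper, and the symbol h_MAP is introduced here only for illustration:

```latex
h_{MAP} = \arg\max_{h \in H} P(h \mid D)
        = \arg\max_{h \in H} \frac{P(D \mid h)\, P(h)}{P(D)}
        = \arg\max_{h \in H} P(D \mid h)\, P(h)
```

The last step drops P(D) because it does not depend on h.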
 
This work is supported by the Post-Doctorate Foundation of Heilongjiang Province under Grant LBH-Z…



978-1-4244-1734-6/08/$25.00 © 2008 IEEE


Bayesian learning is based on the following assumption: the variables under examination follow some probability distribution, and optimal decisions can be derived from these probabilities together with the observed data.

Recognizing the importance of agent learning in automated negotiation, Zeng et al. designed Bazaar, a negotiation model based on a sequential decision-making process, which aims to support agent learning during negotiation. An agent can update its knowledge during the interaction so as to choose the strategy with the higher payoff. Because a learning mechanism is introduced, an agent in the Bazaar model has more negotiation capability than an agent without learning capability. However, learning in the Bazaar model is learning of agent negotiation in a static environment. Because it lacks the information that links the different states of the environment, the model is not suitable for negotiation problems in a dynamic environment [ ].

2.2 Genetic Algorithm

A genetic algorithm treats automated negotiation as an optimization problem: a dynamic search in a negotiation space consisting of the negotiation issues and their possible solutions. Given the agent's utility function, it searches for the best negotiation result. Haynes and Sen employed an extension of genetic programming, strongly typed genetic programming, which encodes a multi-agent strategy as symbolic expressions and an evaluation rule, and gradually improves the efficiency of the cooperative strategy through a fitness function [ ]. Genetic algorithms have two drawbacks in this application. First, an agent's own behavior, knowledge and strategy are too complicated, which makes genetic programming inefficient in practice. Second, because explicit and sufficient communication is lacking, multi-agent strategies obtained from genetic algorithms cannot be applied to competitive co-evolution settings.

2.3 Reinforcement Learning [ ]

Oliveira and Rocha designed a virtual market and employed a reinforcement learning method to generate proposals. Their approach uses the traditional Q-learning algorithm, which does not take dynamic changes of the learning environment into consideration. Q-learning is one of the most important reinforcement learning algorithms and an important breakthrough in reinforcement learning research. It was proposed by Watkins and is regarded as a form of temporal-difference learning.

Q-learning is in fact a variant of the Markov decision process (MDP). The MDP is taken as the reinforcement learning model and is defined as follows.

Definition 1: An MDP is a tuple $\langle S, A, r, T \rangle$, where $S$ is the state space, $A$ is the action space, $r: S \times A \to \mathbb{R}$ is the agent's reward function, and $T: S \times A \to PD(S)$ is the state transition function, with $PD(S)$ the probability distributions over the state space. The agent aims to find the optimal strategy in every state, maximizing the expected sum of discounted rewards.

Definition 2: The Q value is the estimate of a state-action pair which, according to the MDP definition, is

$$Q(s, a) = r(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \max_{a'} Q(s', a')$$

where $s \in S$, $a \in A$, $s' \in S$ is the next state after action $a$, and $\gamma$ is the discount factor ($0 < \gamma < 1$). Given $r$ and $T$, the final Q values can be computed. Because the Q value is the exact sum of future rewards, the learning method uses the Q value in place of the immediate reward. Choosing the action with the greatest Q value at every step is the optimal strategy of the MDP. In practice, however, $r$ and $T$ are unknown. The agent only observes tuples $(s, a, r, s')$ and learns the Q values through interaction with the environment. Because the reward is usually delayed, agent learning faces both the temporal credit assignment problem and the structural credit assignment problem. Traditional reinforcement learning uses temporal-difference (TD) learning to solve temporal credit assignment, as follows:

$$Q(s, a) = (1 - \alpha)\, Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') \right)$$

where $\alpha$ is the learning rate and $r$ is the reward of action $a$ in state $s$. This reinforcement learning algorithm is the Q-learning algorithm. Using Markov theory and stochastic approximation theory, it has been proven that Q-learning converges to the optimal solution of the MDP.

However, Q-learning still faces some pressing problems, one of which is the balance between exploration and exploitation. Exploitation chooses the optimal action according to the current state-action values. As in general optimization problems, aggressive exploitation is prone to getting stuck in a local optimum and failing to reach the global optimum. Exploration means taking actions currently perceived as non-optimal, which lets the agent acquire more knowledge, escape local optima and eventually find the optimal solution. But too much exploration slows down the learning algorithm and in some cases harms the learning result.

3 BELIEF-BASED DYNAMIC Q-LEARNING ELECTRONIC BUSINESS AUTOMATIC NEGOTIATION SYSTEM

The learning mechanism of this paper consists of two aspects: one is learning the environment in the current negotiation state and the belief about the rival's information, which uses Bayesian learning to update the belief; the other is learning the dynamic negotiation process, which uses dynamic Q-learning to generate proposals.

3.1 Evaluation proposal

A similarity function is used to evaluate proposals.

Definition 3: The similarity of proposals $x$ and $y$ with respect to the attribute set $M$ is defined as

$$Sim(x, y) = \sum_{m \in M} w_m \, sim_m(x, y)$$


1136 2008 Chinese Control and Decision Conference (CCDC 2008)


In the similarity measure $Sim(x, y)$ of Definition 3, $sim_m$ is the similarity of proposals $x$ and $y$ with respect to attribute $m$, and $w_m$ is the weight of attribute $m$, with $\sum_{m \in M} w_m = 1$.

If $Sim(x, y) \ge \xi$, the agent accepts the proposal it has just received and the negotiation ends successfully. $\xi$ is the acceptance threshold of the negotiating party, set according to its background and knowledge.

Definition 4: For a given domain $D_m$, the similarity of two values $x_m, y_m \in D_m$ is defined as

$$sim_m(x_m, y_m) = \cos(\vec{x}_m, \vec{y}_m) = \frac{\vec{x}_m \cdot \vec{y}_m}{|\vec{x}_m| \times |\vec{y}_m|}$$

where the vectors $\vec{x}_m$ and $\vec{y}_m$ are taken over a set of comparison criteria for the value of attribute $m$; each component expresses the degree to which, under one criterion, attribute $m$ takes that value. By using several comparison criteria, the negotiating agent can measure the attributes of the whole proposal in a comprehensive and balanced way.

3.2 Belief Update

A Bayesian belief network provides a descriptive modeling language that allows knowledge to be encoded flexibly and conveniently. On this basis, this paper uses Bayesian methods to update the negotiation agent's knowledge of and beliefs about the environment and the rival agent. In view of the computational complexity of Bayesian methods, a Bayesian belief network is used to represent and update the beliefs.

Because the parties in a negotiation do not know each other's reservation values and strategies, they make assumptions according to their own knowledge and understanding. Through the exchange of proposals and counter-proposals, the agents of the two parties update their beliefs using the information exchanged and their own knowledge, gradually learn the proposal structure and strategy of the rival agent, and generate proposals that benefit themselves.

The values of the goods attributes that the negotiating agent assumes for the rival agent form a hypothesis set $D_j = \{ d_i \mid i = 1, 2, \ldots, n \}$. According to the agent's prior knowledge, each assumed value has a probability estimate, forming a probability set $P_j(d_i)$. A proposal from the rival agent is taken as evidence $e$; according to the currently observed knowledge, each hypothesis has a conditional probability $P_j(e \mid d_i)$ [ ]. Applying Bayes' rule gives the posterior probability:

$$P_j(d_i \mid e) = \frac{P_j(d_i)\, P_j(e \mid d_i)}{\sum_{k=1}^{n} P_j(e \mid d_k)\, P_j(d_k)}$$

According to the new probability values, the agent updates its belief and adjusts its negotiation strategy.

3.3 Dynamic Q-learning

In an MDP, the environment's state transitions are defined by a transition probability function that does not change over time. But in a multi-agent environment where several agents learn online, the behavior of every agent changes as it learns, so the transition function changes over time and the MDP is no longer suitable. Nevertheless, many MDP-based reinforcement learning methods do not distinguish between a static environment and other learning agents. This paper designs an agent-based dynamic Q-learning algorithm for the current negotiation environment, which computes the estimated Q value according to the updated belief about the environment. The method relies on the agent's estimate (belief) about the rivals and the environment rather than on the rivals' Q functions, so it is not necessary to observe the rivals' real rewards or their Q-learning parameters.

The joint action $(a_{self}, a_{other})$ is used to estimate the Q value of a state-action pair, where $a_{self}$ denotes the agent's own action and $a_{other}$ denotes the action of the rival agent.

In the current state $s$, the agent estimates the action $a_{other}$ of the rival agent and chooses its own action $a_{self}$ with a stochastic strategy:

$$prb(a_{self} \mid s) = \frac{e^{\bar{Q}(s, a_{self}) / \tau}}{\sum_{a_k \in A} e^{\bar{Q}(s, a_k) / \tau}}$$

where $\bar{Q}(s, a_{self})$ is the expected Q value under the belief $P_s$ in state $s$, i.e.

$$\bar{Q}(s, a_{self}) = \langle Q(s, a_{self}, a_{other}) \rangle_{P_s} = \int_B Q(s, a_{self}, a_{other}) \times P_s \, dB$$

At time $t$, the environment changes to a new state $s'$ and the reward $r_t$ of the action is obtained. The agent updates the Q value with the following formula:

$$Q_t(s, a_{self}, a_{other}) = (1 - \alpha_t)\, Q_{t-1}(s, a_{self}, a_{other}) + \alpha_t \left[ r_t + \gamma \max_{a'_{self} \in A} Q_{t-1}(s', a'_{self}, a'_{other}) \right]$$

where $a'_{other} = \arg\max_{a_{other} \in A} P_{s'}(a_{other})$ is taken as the action actually executed by the rival agent, and $\alpha_t \in [0, 1]$ is the learning rate. $\alpha_t$ decays over time, which helps the learning algorithm converge.

In an automatic negotiation system, the agent's history may become outdated because the rival changes its strategy. At the same time, as learning progresses, the knowledge gained by the agent becomes more accurate, which shows as the Q value converging to $Q^*$. If much exploration is carried out at this stage, system performance will certainly decrease. Therefore, after sufficient interaction between the agent and the environment, it is necessary to reduce exploration.
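The belief update of Section 3.2 and the action selection and Q update of Section 3.3 can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the belief is a discrete distribution over the rival's actions (so the integral over beliefs becomes a sum), states and actions are plain hashable labels, and all function names and the action labels are hypothetical.

```python
import math
import random
from collections import defaultdict


def bayes_update(prior, likelihood):
    """Posterior P(d_i | e) from prior P(d_i) and likelihood P(e | d_i)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return [j / total for j in joint]


def expected_q(q, s, a_self, belief):
    """Q-bar(s, a_self): Q averaged over the believed rival actions.

    Assumption: a discrete sum stands in for the integral in the text.
    """
    return sum(p * q[(s, a_self, a_other)] for a_other, p in belief.items())


def boltzmann_probs(q, s, actions, belief, tau):
    """Softmax over Q-bar with temperature tau."""
    qbars = [expected_q(q, s, a, belief) for a in actions]
    top = max(qbars)  # subtract the maximum for numerical stability
    weights = [math.exp((v - top) / tau) for v in qbars]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q, s, actions, belief, tau, rng=random):
    """Sample the agent's own action from the Boltzmann distribution."""
    probs = boltzmann_probs(q, s, actions, belief, tau)
    return rng.choices(actions, weights=probs, k=1)[0]


def q_update(q, s, a_self, a_other, r, s_next, actions, belief_next, alpha, gamma):
    """One step of the joint-action Q update; returns the new Q value."""
    # The rival's next action is the most probable one under the belief in s'.
    a_other_next = max(belief_next, key=belief_next.get)
    best_next = max(q[(s_next, a, a_other_next)] for a in actions)
    old = q[(s, a_self, a_other)]
    q[(s, a_self, a_other)] = (1 - alpha) * old + alpha * (r + gamma * best_next)
    return q[(s, a_self, a_other)]


if __name__ == "__main__":
    actions = ["concede", "hold"]   # hypothetical action labels
    q = defaultdict(float)          # assumption: all Q values start at 0
    belief = dict(zip(actions, bayes_update([0.5, 0.5], [0.8, 0.2])))
    a = choose_action(q, "s0", actions, belief, tau=0.5)
    new_q = q_update(q, "s0", a, "concede", 1.0, "s1", actions, belief,
                     alpha=0.1, gamma=0.9)
    print(belief, a, new_q)
```

Keeping the Q table keyed by (state, own action, rival action) mirrors the joint-action form of the update, while the agent only ever samples its own action from the belief-averaged values.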





Here a recency-based exploration bonus is introduced into Q-learning:

$$Q'_t(s, a_{self}, a_{other}) = Q_t(s, a_{self}, a_{other}) + b\big(\rho_t(s, a_{self}, a_{other})\big)$$

where $\rho$ is the waiting time and $b(\rho)$, a function of $\rho$ with parameters $\sigma$ and $\lambda$, is the exploration bonus. Thus the learning agent only explores states that have not yet been explored, and it remains ready to adapt to any change of the rival agent.

3.4 Proposal generation process

Generally, when a Q-learning algorithm is designed, the first consideration is how to determine the state space $S$, the action space $A$ and the reward $r$.

Definition 5: A state $s$ is defined as a proposal accepted by the agent. It is an n-tuple [ ]:

$$s = (x_1, x_2, \ldots, x_m, \ldots, x_n)$$

where $x_m$ is the value of goods attribute $m$ in the negotiation. The set of all states $s$ is the state space $S$. $s^*$ is the optimal proposal expected by the negotiating agent.

Definition 6: An action $a$ is defined as the agent changing or retaining the value of attribute $m$, i.e. the proposal currently put forward by the agent. The set of all actions $a$ is the action space [ ].

Definition 7: The reward $r$ is defined as

$$r = \sum_{m \in M} w_m \, v(x_m)$$

where $v(x_m)$ is the scoring function of the value $x_m$ of attribute $m$. Because the proposal an agent receives is generated by the rival agent in response to the previous proposal, the agent's scoring function is used to evaluate the attribute values offered by the rival agent, and the overall evaluation across the attributes defines the reward.

3.5 Belief-based dynamic Q-learning experiment

Three types of learning agent are used: the first is a traditional Q-learning agent; the second is the belief-based Q-learning agent; the third is a Q-learning agent with random estimates, whose estimate of the rival's behavior is random.

The parameters are $\alpha$ = …, $\gamma$ = …, $\tau$ = …; all Q values are initialized to …, and $p$ = ….

The experimental results show that the first and second types of learning agent perform well: they converge quickly and adapt to the environment as the number of exchanged proposals increases. The third agent also learns, but it fails to learn the joint behavior of the two parties because its estimate of the rival's behavior is chosen at random. The algorithm converges for the first (traditional Q-learning) agent. The second type of learning agent contains a hidden variable; if the estimate of the hidden variable is inappropriate, its learning process performs as poorly as that of the third agent. The results indicate that the first and second agents have similar learning capability, which shows that the belief-based learning method is effective. Moreover, because a fixed Bayesian rule is used to update the belief, the whole process converges. The proposed algorithm therefore converges given an appropriate setup.

The experimental results are plotted in Fig. 1. Line … denotes the third type of learning agent, Line … the first type, and Line … the second type.

Fig. 1 Experiment results (horizontal axis: learning time; vertical axis: average time steps)

4 Conclusion

With the wide application of electronic business, interest in automated negotiation and agent software is growing. The qualities of agent software, such as autonomy, social ability and intelligence, suit the flexibility that automated negotiation requires. This article discusses how to use online learning to improve negotiation efficiency in agent-based multi-issue automated negotiation. It analyses Bayesian learning and its application to negotiation, and it extends traditional Q-learning by introducing agent-based belief and the recently studied dynamic learning.

REFERENCES

[1] Zhang W. and Yao L., 'Intelligent coordination information technique', Beijing: Electronics Industry Press (in Chinese).
[2] He Y. and Chen C., 'Agent and multi-agent system design and application', Wuhan: Wuhan University Press (in Chinese).
[3] Liu J. et al., 'Multi-intelligent agent principles and technology', Beijing: Qinghua University Press (in Chinese).
[4] Yang M., Jia L. and Qiu Y., 'Reinforcement learning-based multi-agent self-discussion study', Computer Engineering and Application (in Chinese).
[5] Zhang H. and Huang S., 'Reinforcement learning-based agent negotiation mode', Computer Engineering and Application (in Chinese).
[6] Zhao L. and Hou B., 'Agent concept model and its application technique', Computer Engineering and Science (in Chinese).


[7] Zhao L. and Hou B., 'Multi-agent system organizational structure and coordination', Computer Engineering and Application (in Chinese).
[8] Dong H. and Wang J., 'Multi-agent technology research' (in Chinese).
[9] Yang M., Lu R. and Qiu Y., 'Research on the multi-agent self-discussion machine learning application', Communication and Computer (in Chinese).
[10] Wang L., Gao Y. and Chen S., 'Reinforcement learning-based agent discussion mode in AODE' (in Chinese).
[11] Liu Y., C. Pluempitiwiriyawej, Y. Shi, H. Lam, S. Y. W. Su and H. Chan, "A Rule Warehouse System for Knowledge Sharing and Business Collaboration", Technical Report UF CISE TR…, CISE Department, University of Florida, Gainesville, Florida. Available at ftp://ftp.cise.ufl.edu/cis/tech-reports/tr…/tr…pdf
[12] Alessio R. Lomuscio, Michael Wooldridge and Nicholas R. Jennings. A Classification Scheme for Negotiation in Electronic Commerce [J]. Agent Mediated Electronic Commerce: A European Perspective, Springer Verlag.
[13] Mihai Barbuceanu and Wai-Kau Lo. A Multi-attribute Utility Theoretic Negotiation for Electronic Commerce [J]. Proceedings of the …th International Conference on Autonomous Agents, Barcelona, Spain.
