
Ensemble Learning

Consider a set of classifiers h1, ..., hL

Idea: construct a classifier H(x) that combines the individual decisions of h1, ..., hL
•  e.g., could have the member classifiers vote, or
•  e.g., could use different members for different regions of the instance space
•  Works well if the members each have low error rates

Successful ensembles require diversity
•  Classifiers should make different mistakes
•  Can have different types of base learners

[Based on slide by Léon Bottou]   4


Practical Application: Netflix Prize

Goal: predict how a user will rate a movie
•  based on the user's ratings for other movies
•  and other people's ratings
•  with no other information about the movies

This application is called "collaborative filtering"

Netflix Prize: $1M to the first team to do 10% better than Netflix's system (2007–2009)

Winner: BellKor's Pragmatic Chaos – an ensemble of more than 800 rating systems

[Based on slide by Léon Bottou]   5
Combining Classifiers: Averaging

[Diagram: x is fed to members h1, h2, ..., hL; their outputs are combined into H(x)]

•  Final hypothesis is a simple vote of the members

6
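The simple vote above can be sketched in a few lines of Python. The three member classifiers here are hypothetical threshold rules standing in for trained models; only the combination scheme is the point.

```python
# Three hypothetical member classifiers mapping x to a label in {-1, +1}
def h1(x): return 1 if x > 0.2 else -1
def h2(x): return 1 if x > 0.5 else -1
def h3(x): return 1 if x > 0.7 else -1

def H(x):
    # Final hypothesis: simple (unweighted) majority vote of the members
    votes = h1(x) + h2(x) + h3(x)
    return 1 if votes > 0 else -1

print(H(0.6))  # h1 and h2 vote +1, h3 votes -1, so the majority is +1
```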
Combining Classifiers: Weighted Average

[Diagram: x is fed to members h1, ..., hL; their outputs are combined into H(x) with weights w1, ..., wL]

•  Coefficients of individual members are trained using a validation set

7
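One simple way to set the coefficients from a validation set is to weight each member by its validation accuracy. The slide only says the coefficients are trained on a validation set, so this particular choice, and the threshold rules h1–h3, are illustrative assumptions.

```python
# Hypothetical member classifiers (the true concept is "+1 iff x > 0.5")
def h1(x): return 1 if x > 0.2 else -1
def h2(x): return 1 if x > 0.5 else -1
def h3(x): return 1 if x > 0.7 else -1

members = [h1, h2, h3]
# Validation set on a grid of x values with the true labels
val = [(i / 100, 1 if i / 100 > 0.5 else -1) for i in range(100)]

def accuracy(h):
    return sum(h(x) == y for x, y in val) / len(val)

weights = [accuracy(h) for h in members]  # one simple choice: weight = accuracy

def H(x):
    # Weighted vote: better-performing members count for more
    s = sum(w * h(x) for w, h in zip(weights, members))
    return 1 if s >= 0 else -1
```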
Combining Classifiers: Gating

[Diagram: x is fed to members h1, ..., hL and to a gating function that produces weights w1, ..., wL; the weighted outputs are combined into H(x)]

•  Coefficients of individual members depend on the input
•  Train the gating function via a validation set

8
Combining Classifiers: Stacking

[Diagram: x is fed to the 1st-layer members h1, ..., hL; their predictions feed a 2nd-layer classifier C, which outputs H(x)]

1st Layer   2nd Layer

•  Predictions of the 1st layer are used as input to the 2nd layer
•  Train the 2nd layer on a validation set

9
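A minimal stacking sketch: the 1st-layer predictions become the feature vector for a 2nd-layer classifier C trained on a validation set. The fixed threshold rules and the perceptron used as C are illustrative assumptions (any classifier could play the role of C).

```python
import random

random.seed(0)

# 1st layer: three fixed weak rules h_l(x) in {-1, +1}, hypothetical stand-ins
# for trained classifiers; the true concept is "+1 iff x > 0.5"
h = [lambda x: 1 if x > 0.3 else -1,
     lambda x: 1 if x > 0.5 else -1,
     lambda x: 1 if x > 0.8 else -1]

def features(x):
    # 1st-layer predictions become the input to the 2nd layer
    return [hl(x) for hl in h]

# Validation set used to train the 2nd layer
val = [(x, 1 if x > 0.5 else -1) for x in (random.random() for _ in range(200))]

# 2nd layer C: a perceptron over the 1st-layer predictions (plus a bias term)
w = [0.0, 0.0, 0.0, 0.0]
for _ in range(20):  # a few epochs over the validation set
    for x, y in val:
        z = features(x) + [1.0]
        pred = 1 if sum(wi * zi for wi, zi in zip(w, z)) >= 0 else -1
        if pred != y:  # standard perceptron update on a mistake
            w = [wi + y * zi for wi, zi in zip(w, z)]

def H(x):
    z = features(x) + [1.0]
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) >= 0 else -1
```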
How to Achieve Diversity

Cause of the Mistake        Diversification Strategy
Pattern was difficult       Hopeless
Overfitting                 Vary the training sets
Some features are noisy     Vary the set of input features

[Based on slide by Léon Bottou]   10


Manipulating the Training Data

Bootstrap replication:
•  Given n training examples, construct a new training set by sampling n instances with replacement
•  Excludes ~37% (≈ 1/e) of the training instances on average

Bagging:
•  Create bootstrap replicates of the training set
•  Train a classifier (e.g., a decision tree) for each replicate
•  Estimate classifier performance using out-of-bootstrap data
•  Average the output of all classifiers

Boosting: (in just a minute...)

[Based on slide by Léon Bottou]   11
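The excluded fraction is easy to check empirically: sample n indices with replacement and count how many of the original indices never appear. The dataset size here is an arbitrary choice for illustration.

```python
import random

random.seed(0)
n = 10_000
# Bootstrap replicate: sample n instances (indices) with replacement
indices = [random.randrange(n) for _ in range(n)]
in_bag = set(indices)
# Fraction of the original instances that the replicate excludes
oob_fraction = 1 - len(in_bag) / n
print(round(oob_fraction, 2))  # close to 1/e ≈ 0.37
```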
Manipulating the Features

Random Forests
•  Construct decision trees on bootstrap replicates
   –  Restrict the node decisions to a small subset of features, picked randomly for each node
•  Do not prune the trees
   –  Estimate tree performance on out-of-bootstrap data
•  Average the output of all trees

[Based on slide by Léon Bottou]   12


Boosting

14
AdaBoost
[Freund & Schapire, 1997]

•  A meta-learning algorithm with great theoretical and empirical performance

•  Turns a base learner (i.e., a "weak hypothesis") into a high-performance classifier

•  Creates an ensemble of weak hypotheses by repeatedly emphasizing mispredicted instances

15
AdaBoost

INPUT: training data X, y = {(xi, yi)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:    Train model ht on X, y with instance weights wt
4:    Compute the weighted training error rate of ht:
         εt = Σ_{i : yi ≠ ht(xi)} wt,i
5:    Choose βt = ½ ln((1 − εt) / εt)
6:    Update all instance weights:
         wt+1,i = wt,i exp(−βt yi ht(xi))   ∀i = 1, ..., n
7:    Normalize wt+1 to be a distribution:
         wt+1,i = wt+1,i / Σ_{j=1}^n wt+1,j   ∀i = 1, ..., n
8: end for
9: Return the hypothesis
      H(x) = sign( Σ_{t=1}^T βt ht(x) )

•  Size of point represents the instance's weight

16
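The pseudocode above can be sketched in plain Python. The toy 1-D dataset and the decision-stump weak learner are illustrative assumptions (the slides leave the base learner unspecified); the step comments map back to the line numbers of the pseudocode.

```python
import math

# Toy 1-D dataset (an assumption for illustration): mostly "+1 iff x >= 5",
# but x = 3 is positive and x = 4 is negative, so no single threshold
# classifies every point correctly.
X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [-1, -1, -1, 1, -1, 1, 1, 1, 1, 1]

def train_stump(X, y, w):
    """Weak learner (step 3): pick the threshold/polarity decision stump
    with the lowest weighted training error."""
    best_err, best_h = None, None
    for thresh in [x + 0.5 for x in X]:
        for polarity in (1, -1):
            h = lambda x, t=thresh, p=polarity: p if x >= t else -p
            err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
            if best_err is None or err < best_err:
                best_err, best_h = err, h
    return best_err, best_h  # (epsilon_t, h_t)

def adaboost(X, y, T=10):
    n = len(X)
    w = [1.0 / n] * n                              # step 1: uniform weights
    ensemble = []
    for _ in range(T):                             # step 2
        eps, h = train_stump(X, y, w)              # steps 3-4
        if eps >= 0.5:                             # weak-learning assumption fails
            break
        if eps == 0:                               # perfect weak hypothesis
            ensemble.append((1.0, h))
            break
        beta = 0.5 * math.log((1 - eps) / eps)     # step 5
        ensemble.append((beta, h))
        w = [wi * math.exp(-beta * yi * h(xi))     # step 6: reweight instances
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]                   # step 7: normalize
    def H(x):                                      # step 9: weighted vote
        return 1 if sum(b * h(x) for b, h in ensemble) >= 0 else -1
    return H

H = adaboost(X, y)
print([H(xi) for xi in X])  # ensemble predictions on the training points
```

Note how the ensemble corrects the "hard" points at x = 3 and x = 4 that no single stump can get right: later rounds concentrate weight on them, and the weighted vote pieces the regions together.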
AdaBoost

17
AdaBoost

[Illustration: weighted training points labeled + and –]

18
AdaBoost

•  βt measures the importance of ht
•  If εt < 0.5, then βt > 0 (can trivially guarantee)

19
AdaBoost

•  βt measures the importance of ht
•  If εt < 0.5, then βt > 0 (βt grows as εt gets smaller)

20
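The growth of βt as εt shrinks is easy to tabulate directly from the formula in step 5 of the algorithm; the particular εt values below are arbitrary examples.

```python
import math

def beta(eps):
    # beta_t = 1/2 * ln((1 - eps_t) / eps_t), as in step 5 of the algorithm
    return 0.5 * math.log((1 - eps) / eps)

# beta_t is 0 at eps_t = 0.5 and grows without bound as eps_t shrinks toward 0
for eps in (0.5, 0.3, 0.1, 0.01):
    print(eps, beta(eps))
```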
AdaBoost

•  Weights of correct predictions are multiplied by e^(−βt) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

21
AdaBoost

•  Weights of correct predictions are multiplied by e^(−βt) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

22
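A worked example of the two reweighting factors, for an assumed weighted error εt = 0.2 (any value below 0.5 behaves the same way):

```python
import math

eps = 0.2  # assumed weighted error of the weak hypothesis h_t
beta = 0.5 * math.log((1 - eps) / eps)  # beta_t = ln(2) for eps = 0.2
up = math.exp(beta)     # multiplier for incorrectly predicted instances
down = math.exp(-beta)  # multiplier for correctly predicted instances
print(up, down)         # the two factors always multiply to 1
```

So with εt = 0.2, each misclassified instance doubles in weight while each correctly classified instance halves, before the normalization step.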
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
INPUT: training data X, y = {(xi, yi)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model ht on X, y with instance weights wt
4:   Compute the weighted training error rate of ht:
       εt = Σ_{i : yi ≠ ht(xi)} wt,i
5:   Choose βt = (1/2) ln((1 − εt) / εt)
6:   Update all instance weights:
       wt+1,i = wt,i exp(−βt yi ht(xi)),  ∀i = 1, ..., n
7:   Normalize wt+1 to be a distribution:
       wt+1,i = wt+1,i / Σ_{j=1}^n wt+1,j,  ∀i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign( Σ_{t=1}^T βt ht(x) )

[Illustration: weighted training instances (+/−) at round t = 1]

Disclaimer: Note that resized points in the illustration above
are not necessarily to scale with βt

23
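The loop above can be sketched in Python. This is a minimal NumPy implementation, not from the slides; the single-feature decision stump used as the base learner ht is an illustrative choice, and all names are made up here:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, sign) stump with the lowest
    weighted error -- the base learner h_t of step 3."""
    n, d = X.shape
    best = None
    for j in range(d):
        for thresh in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] >= thresh, sign, -sign)
                err = w[y != pred].sum()
                if best is None or err < best[0]:
                    best = (err, j, thresh, sign)
    return best[1:]

def stump_predict(stump, X):
    j, thresh, sign = stump
    return np.where(X[:, j] >= thresh, sign, -sign)

def adaboost(X, y, T):
    n = len(y)
    w = np.full(n, 1.0 / n)                  # step 1: uniform weights
    ensemble = []
    for t in range(T):                       # step 2
        stump = train_stump(X, y, w)         # step 3
        pred = stump_predict(stump, X)
        eps = w[y != pred].sum()             # step 4: weighted error rate
        eps = np.clip(eps, 1e-10, 1 - 1e-10) # guard the log at eps = 0 or 1
        beta = 0.5 * np.log((1 - eps) / eps) # step 5
        w = w * np.exp(-beta * y * pred)     # step 6: reweight instances
        w = w / w.sum()                      # step 7: normalize
        ensemble.append((beta, stump))
    return ensemble

def H(ensemble, X):
    """Step 9: weighted vote of the members."""
    score = sum(beta * stump_predict(s, X) for beta, s in ensemble)
    return np.sign(score)
```

On a 1-D interval-labeled toy set such as y = [+1, +1, −1, −1, +1, +1] over x = 0..5, no single stump is perfect, but three boosted stumps classify every point correctly.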
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

24
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

25
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

26
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

•  βt measures the importance of ht
•  If εt ≤ 0.5, then βt ≥ 0 (βt grows as εt gets smaller)

27
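Plugging a few error rates into the formula βt = (1/2) ln((1 − εt)/εt) shows this behavior directly. A small sketch, not from the slides:

```python
import math

def beta(eps):
    # beta_t = 1/2 * ln((1 - eps_t) / eps_t), step 5 of AdaBoost
    return 0.5 * math.log((1 - eps) / eps)

# beta_t is 0 for a coin-flip learner and grows as the error shrinks
print(beta(0.5))   # 0.0
print(beta(0.3))   # ≈ 0.4236
print(beta(0.1))   # ≈ 1.0986
print(beta(0.01))  # ≈ 2.2976
```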
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

•  Weights of correct predictions are multiplied by e^(−βt) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

28
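Because yi·ht(xi) is +1 on a correct prediction and −1 on a mistake, the step-6 update wt+1,i = wt,i exp(−βt yi ht(xi)) shrinks correct weights by e^(−βt) and grows incorrect ones by e^(βt). A small NumPy sketch of one update with εt = 0.25 (illustrative, not from the slides):

```python
import numpy as np

y    = np.array([ 1, -1,  1,  1])  # true labels
pred = np.array([ 1, -1, -1,  1])  # h_t's predictions: one mistake, at i = 2
w    = np.full(4, 0.25)            # current (uniform) instance weights
beta = 0.5 * np.log((1 - 0.25) / 0.25)   # eps_t = 0.25  ->  beta_t ≈ 0.5493

# step 6: correct weights shrink by e^{-beta}, the wrong one grows by e^{beta}
w_new = w * np.exp(-beta * y * pred)
w_new /= w_new.sum()               # step 7: renormalize to a distribution
print(w_new)   # ≈ [1/6, 1/6, 1/2, 1/6] -- the mistake now carries half the mass
```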
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 2]

•  Weights of correct predictions are multiplied by e^(−βt) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

29
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 3]

30
AdaBoost

[AdaBoost pseudocode repeated from the previous slide; the illustration
shows the weighted training instances (+/−) at round t = 3]

31
AdaBoost

•  βt measures the importance of ht
•  If εt < 0.5, then βt > 0  (βt grows as εt gets smaller)

32  
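Line 5 is the only place the weak learner's quality enters. A quick numeric check of βt = ½ ln((1 − εt)/εt) (the ε values below are arbitrary examples):

```python
import math

def beta(eps):
    """Hypothesis weight from the weighted error rate (line 5 of AdaBoost)."""
    return 0.5 * math.log((1 - eps) / eps)

for eps in (0.5, 0.4, 0.25, 0.1, 0.01):
    print(f"eps = {eps:<5} -> beta = {beta(eps):.3f}")
```

A weak learner at chance level (ε = 0.5) gets weight 0 and contributes nothing to the vote; the weight rises without bound as ε approaches 0.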
AdaBoost

•  Weights of correct predictions are multiplied by e^(−βt) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

33  
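This choice of βt has a useful consequence: after the reweighting (line 6) and normalization (line 7), the misclassified points collectively hold exactly half of the total weight, which forces the next weak learner to attend to them. A small sketch of one reweighting step, with a hypothetical 10-point round where εt = 0.2:

```python
import math

eps = 0.2
beta = 0.5 * math.log((1 - eps) / eps)           # line 5

n = 10
w = [1.0 / n] * n                                # uniform weights this round
wrong = [0, 1]                                   # 2 of 10 points misclassified, so eps = 0.2
new_w = [wi * (math.exp(beta) if i in wrong else math.exp(-beta))
         for i, wi in enumerate(w)]              # line 6: reweight
z = sum(new_w)
new_w = [wi / z for wi in new_w]                 # line 7: normalize

print(sum(new_w[i] for i in wrong))              # misclassified points now hold half the mass
```

The halving is not specific to these numbers: the misclassified mass becomes ε·e^β = √(ε(1 − ε)) and the correct mass (1 − ε)·e^(−β) = √(ε(1 − ε)), which are equal, so each side is ½ after normalization.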
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on
weighted ⇣ X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of
1iterations⇣ iterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=3
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7:
t = ✏t =
6: Normalize
✏t =
Update wt,iall w t,i
2w to be
instance
t+1 t,i t a distribution:
✏weights:
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i ( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n+ –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall for
Normalize
Update
Return all the
instance instance w
allhypothesisweights:
instance to be
t+1weights: a distribution:
weights:
w t+1,i
9: Return w t+1,ithe = Pnwt+1,i
hypothesis
P 8i = = 1, 1, .. .. .. ,, n n
wt+1,i = nwt+1,i 8i !
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
=w P =nthe P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.= , n. . . , n
t+1,i
•  Weights   t+1,i
j=1 j=1 w w
t+1,jj=1 o f  
t+1,jX c
w Xorrect  
Tt+1,j p redicIons  
!
! are  mulIplied  by   e t
1
H(x) =
H(x) = sign sign X T t ht (x)
t ht (x)
for
end •  Weights  
for
8: end for H(x) = sign of  incorrect  
t=1
t=1 t t
h (x) predicIons  are  mulIplied  by    e
t
1
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
34  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=4
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7:
t = ✏t =
6: Normalize
✏t =
Update wt,iall w t,i
2w to be
instance
t+1 t,i t a distribution:
✏weights:
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall for
Normalize
Update
Return all the
instance instance w
allhypothesisweights:
instance to be
t+1weights: a distribution:
weights:
w t+1,i
9: Return w t+1,ithe = Pnwt+1,i
hypothesis
P 8i = = 1, 1, .. .. .. ,, n n
wt+1,i = nwt+1,i 8i !
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
35  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=4
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7:
t = ✏t =
6: Normalize
✏t =
Update wt,iall w t,i
2w to be
instance
t+1 t,i t a distribution:
✏weights:
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall
Return
9: Return
for
Normalize
Update
w
all the
instance
t+1,i
instance
the
wt+1,i = nwt+1,i =
w
allhypothesis
Pnwt+1,i
hypothesis
P
w to be
t+1weights:
weights:
instance
t+1,i
a distribution:
weights:
8i =
8i = 1,
!
1, .. .. .. ,, n n +
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
36  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=4
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7:
t = ✏t =
6: Normalize
✏t =
Update wt,iall w t,i
2w to be
instance
t+1 t,i t a distribution:
✏weights:
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall
Return
9: Return
for
Normalize
Update
w
all the
instance
t+1,i
instance
the
wt+1,i = nwt+1,i =
w
allhypothesis
Pnwt+1,i
hypothesis
P
w to be
t+1weights:
weights:
instance
t+1,i
a distribution:
weights:
8i =
8i = 1,
!
1, .. .. .. ,, n n +
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
37  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=5
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7:
t = ✏t =
6: Normalize
✏t =
Update wt,iall w t,i
2w to be
instance
t+1 t,i t a distribution:
✏weights:
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall
Return
9: Return
for
Normalize
Update
w
all the
instance
t+1,i
instance
the
wt+1,i = nwt+1,i =
w
allhypothesis
Pnwt+1,i
hypothesis
P
w to be
t+1weights:
weights:
instance
t+1,i
a distribution:
weights:
8i =
8i = 1,
!
1, .. .. .. ,, n n +
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
38  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=5
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7: Normalize 2w to be t a distribution:
✏weights:
t = ✏t =
+
6: ✏t =
Update wt,iall w t,iinstance
t+1 t,i
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall
Return
9: Return
for
Normalize
Update
w
all the
instance
t+1,i
instance
the
wt+1,i = nwt+1,i =
w
allhypothesis
Pnwt+1,i
hypothesis
P
w to be
t+1weights:
weights:
instance
t+1,i
a distribution:
weights:
8i =
8i = 1,
!
1, .. .. .. ,, n n +
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
39  
thethe number of iterations iT i i=1 ⇥ ⇤
4: Compute weighted training error rate of ⇥ 1 ht : 1⇤
1: Initialize aa the
1: Initialize vector
Xnumber
vector of n
of n uniform
uniform
of iterations weights
weights T w w1 = = ⇥ n1 ,, .. .. .. ,, n1 ⇤

AdaBoost
X 1
2:
1: for t
Initialize✏=t = 1, a. . . ,
vector T wt,iof n uniform weights w 1 = n
1
, . . . , n1
2: for t ✏=t = 1, . . . , T wt,i n n
3:
2: for Train
t = 1, model
i:y.i.6= . h,tT (xh i )t on X, y with instance weights wt  
3: INPUT:Train model hii )t yon =X, y with
{(x instance
n
,i ,ni=1 weights
n
, of hw
y{(x i ,{(x
INPUT:
PUT: training training
i:y training
idata
i 6=htt (x data
X, X,
data =
X, ,yyi= )} yi )} yi )} , rate t
4:
3:
4: the Compute
Train
Compute model the
the h weighted
t on⇣
weighted X, y ⌘ training
iwith
training error
instance
i=1
error weights
i=1
rate of htt : t w :
5:
4: Choose
ComputeX the
number number
the tthe
ofnumber of ⇣
1iterationsiterations ⌘
ln 1 ✏t✏ training error
= 21weighted 1 of✏titerations
T T T ⇥ 1 rate ⇥ 1 1of ⇥⇤1 h1t :⇤ 1 ⇤
5:
Initialize Choose
1: Initialize
alize a vectora✏tvector a vector
of Xnt of= n2wln
uniform uniform
of n weights t
uniform weights wweights
1 = w1 n=,w , .n. .n,,n. . . , n
. .1n. =
+ –
t
6: Update = all instance ✏weights:
X
t=5
t,i t
. t, ✏✏T
.=tt. .= wt,i t
for tfor
2:1,
=6: = 1,
. . Update ,i:y
1,
= T.i.all 6=.h,tT instance
(xiw ) weights:
i:yi 6=ht (xi ) t,i
3:Train
rain model model
Train
wt+1,i hi:y on
t model
i=
h X,
6=thw
on hyi )tX, ony(X,
with
exp with
instance instance
yt ywith
i ht (xi ))
weights
instance weights
8i w t1,w
=weights . .t. , n wt
t (xt,i ⇣
⇣(training ⌘
Compute
ompute
4: wt+1,i
the
Compute the=weighted
weighted w
the exp
training
t,i1weighted 1 ✏tt⌘ ytraining
i ht (x
error errori ))error
rate of8iht=
rate : 1,
of
rate ht.of
:. . h ,n t:
5:
5: Choose
Choose t = 1 ln ⇣ 1 ✏t ⌘
7: XNormalize X X t = wt+1 1 lnto 1be
2
2
✏ t
✏t✏ta distribution:
5:
6: Choose
Update all t w = instance ln weights:
✏7: Normalize 2w to be t a distribution:
✏weights:
t = ✏t =
+
6: ✏t =
Update wt,iall w t,iinstance
t+1 t,i
6: i:y 6=Update all instance wt+1,iweights:
htw
i i:y i 6=
(x
wt+1,i h
)i:y(x 6
=
t+1,i = wt,inw
i t i =
i )h tP(x i )
expt+1,i( t yyi h h8i (x=i ))))1, . .8i 8i.,n = 1, 1, .. .. .. ,, n
n
w = w P exp w ( t (x =
wt+1,i t+1,i ⇣ = t,i ⇣n⌘exp⇣
j=1 ⌘ t+1,j t i 8i
t =i 1, . .
⌘yi ht (xi )) 8i = 1, . . . , n. , n + –
wt+1,i = w t,i j=1
✏t11 ✏t 1 ✏t
w ( t+1,j t
5:
7:Choose
hoose t = 2
Choose
Normalize
1
t =ln 12t1ln =
w✏ 2 ln
t+1 to be
✏tto be
✏t a a distribution:
distribution:
8:
7: endNormalize
for wt t+1
7:Update
8:
pdate
6:
9: endall
Return
9: Return
for
Normalize
Update
w
all the
instance
t+1,i
instance
the
wt+1,i = nwt+1,i =
w
allhypothesis
Pnwt+1,i
hypothesis
P
w to be
t+1weights:
weights:
instance
t+1,i
a distribution:
weights:
8i =
8i = 1,
!
1, .. .. .. ,, n n +
wt+1,iwt+1,i
=w = w=
exp =(exp
wP (yexp
n i w
hw (yTt+1,j
(x ih tt(x
i ))
t+1,j yiih))8i
t (x= i8i 1,=
))1, .. ..8i
1, .n
.. ,, n=. .1, , n. . . , n
+ –
wt+1,i j=1 ttX 8i =
t+1,it,i
t,i t,itj=1 !
H(x) = sign j=1 w X Tt+1,jt ht (x)
8: endNormalize
8:Normalize
ormalize
7: end for
wH(x)
for t+1wto t+1 =be wsign
tot+1abe toat=1 at hdistribution:
distribution:
distribution:
be t (x)
9:
8: Return
end
9: Return for the the hypothesishypothesis t=1
w w w
w9: Return t+1,i t+1,i t+1,i
wt+1,i
t+1,i =w P =nthe
t+1,i P = hypothesis
n Pn 8iT=8i 1,= . .8i
1,
.,n !. .1,
.=
! , n. . . , n
w w
t+1,jj=1 t+1,jXw X Tt+1,j
j=1 j=1
H(x) = sign T h (x) !
H(x) = sign X t ht (x) t t
for
end for for
8: end H(x) = sign t=1 t=1 t ht (x)
urn
Return the hypothesis
9: Return the hypothesisthe hypothesis t=1
40  
AdaBoost

 9: Return the hypothesis
       H(x) = sign( Σ_{t=1}^T α_t h_t(x) )

•  Final model is a weighted combination of members
   –  Each member weighted by its importance

44  
AdaBoost                                         [Freund & Schapire, 1997]

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n,
       the number of iterations T
 1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
 2: for t = 1, ..., T
 3:    Train model h_t on X, y with instance weights w_t
 4:    Compute the weighted training error rate of h_t:
          ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
 5:    Choose α_t = ½ ln((1 − ε_t) / ε_t)
 6:    Update all instance weights:
          w_{t+1,i} = w_{t,i} exp(−α_t y_i h_t(x_i))    ∀i = 1, ..., n
 7:    Normalize w_{t+1} to be a distribution:
          w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}    ∀i = 1, ..., n
 8: end for
 9: Return the hypothesis
       H(x) = sign( Σ_{t=1}^T α_t h_t(x) )

45  
AdaBoost

 1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
 2: for t = 1, ..., T
 3:    Train model h_t on X, y with instance weights w_t

•  w_t is a vector of weights over the instances at iteration t
•  All points start with equal weight

46  
AdaBoost

 3: Train model h_t on X, y with instance weights w_t

•  We need a way to weight instances differently when learning the model...

47  
Training a Model with Weighted Instances

•  For algorithms like logistic regression, can simply
   incorporate weights w into the cost function
   –  Essentially, weigh the cost of misclassification differently
      for each instance

      J_reg(θ) = −Σ_{i=1}^n w_i [ y_i log h_θ(x_i) + (1 − y_i) log(1 − h_θ(x_i)) ] + λ ‖θ_[1:d]‖₂²

•  For algorithms that don't directly support instance
   weights (e.g., ID3 decision trees, etc.), use weighted
   bootstrap sampling
   –  Form training set by resampling instances with
      replacement according to w

48  
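The weighted-bootstrap option can be sketched in a few lines; `weighted_bootstrap` is an illustrative name of mine, assuming the weights w sum to 1:

```python
import numpy as np

def weighted_bootstrap(X, y, w, seed=None):
    """Resample n instances with replacement, drawing instance i with
    probability w[i]. A learner without native instance-weight support
    (e.g., ID3) trained on this sample approximately sees the weighted
    distribution."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y), size=len(y), replace=True, p=w)
    return X[idx], y[idx]
```

Heavily weighted instances appear many times in the resampled training set, which has the same effect as scaling their cost.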
Base Learner Requirements

•  AdaBoost works best with "weak" learners
   –  Should not be complex
   –  Typically high-bias classifiers
   –  Works even when the weak learner has an error rate just
      slightly under 0.5   (i.e., just slightly better than random)
•  Can prove training error goes to 0 in O(log n) iterations

•  Examples:
   –  Decision stumps (1-level decision trees)
   –  Depth-limited decision trees
   –  Linear classifiers

49  
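As an illustration of one such weak learner, here is a minimal decision stump fit to minimize the weighted error; the function names are my own, not from the slides:

```python
import numpy as np

def fit_stump(X, y, w):
    """Exhaustively pick (feature, threshold, polarity) minimizing the
    weighted 0/1 error. Predicts polarity where X[:, feature] > threshold
    and -polarity elsewhere. Labels y are in {-1, +1}."""
    best, best_err = None, np.inf
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for pol in (1, -1):
                pred = np.where(X[:, f] > thr, pol, -pol)
                err = w[pred != y].sum()        # weighted error rate
                if err < best_err:
                    best, best_err = (f, thr, pol), err
    return best

def stump_predict(stump, X):
    f, thr, pol = stump
    return np.where(X[:, f] > thr, pol, -pol)
```

Because the stump re-reads the weight vector w on every fit, plugging it into the boosting loop automatically makes each round focus on the currently hard instances.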
AdaBoost

 4: Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}

•  Error is the sum of the weights of all misclassified instances

50  
INPUT: training data X, y = {(xi , yi )}ni=1 ,
INPUT: the training number dataofX, y = {(xiT, yi )}ni=1
iterations n , ⇥

AdaBoost
INPUT: training data X, y = {(x , y
i , yi )}n )} ,, 1 ⇤
1: INPUT:
Initialize
INPUT: the a training
the vector
training number ofdata
n of X,
uniform
dataofX, y =
iterations {(x
weightsiT
y = {(xiT, yi )}i=1 i w
i=1
n
i=1 1 , ⇥n = , . . . , 1

number iterations   n
1:
2: Initialize
INPUT:
for t = 1, a. the vector
training
.
the . , Tnumber
number ofdata of
n uniform
of X, iterations
iterations {(xiT
y = weights T, yi )} wi=1 n
1 = ⇥ 1
, ⇥ n1 , . . . , n1 ⇤⇤ 1
1:
1: Initialize
INPUT:
Initialize aa training
vector
vector of
of n
data
n uniform
X,
uniform y = weights
{(xiT, yi )}
weights w
w
n 1 = = , ⇥ n11 ,, .. .. .. ,, n11 ⇤
2:
3: for t
Train= 1, .
model.
the . , Tnumber
h on X,of y iterations
with instance i=1
weights , ⇥ nn1 , w
1: Initialize a. training
vector t ofdata
n uniform 1 . . t. , nn1 ⇤
2: INPUT:
for t = 1, .
the . , Tnumber of X, y = weights
iterations {(xiT , yi )} wi=1 n 1 =
2:
3:
1:
4:
2: for t
Train=
Initialize
forCompute
t = 1,
1, .
model. .
a. .vector
. ,
the
, T
T hweighted
t ofonnX, y training
uniform with weights
instance
error wrate weights
1 = of ⇥ n1h, w ..., ⇤
t: t n
3:
1: Train
Initialize modelthe
a. .vector number
h t on
of n X,of y
uniform iterations
with weights T
instance w weights = ⇥ , w
. . t. , 1 ⇤
3:
4:
2:
3: Train
forCompute
t
Train= 1,model
model X. the
, T h
h on
weighted X, y with
training instance
t on X, y with instance weights error weights
rate 1 of n1 w h w :
t t n
1:
4:
2: Initialize
Compute
forCompute
t = 1, a . . vector
. the
, T t of n
weighted uniform training weights
error w rate 1 = of h , .
t :. t. , 1
4:
3:
4: Train
✏=t = model X the hw weighted
on X, y training
t t,i with instance error rate
weights of nhw t :: t n
2:
3: forCompute
t
Train 1, .
model. X. the
, T h weighted
on X, y training
with error
instance rate
weights of h t
w
4: ✏t =i:y
Compute i 6= X the
ht (xi ) wweighted
t t,i training error rate of h t: t
3:
4: ✏✏t =
Train
Compute model X hw on X, y training with instance weights of hw
hthe weighted error rate t: t
= t t,i
t i:yi 6= X t (xiw ) t,i
4: ✏
Compute =
t i:yi 6=hthe w
t (xiweighted
) t,i ⇣ ⌘
training error rate of ht :
5: ✏
Choose t =i:yi 6=
i:yi 6=tht=
X ht (xi1w ) t,i ⇣ 1 ✏t ⌘
(xi2) ln
5: ✏
Choose t = X = (xi1w t,i ⇣
) ln 1 ✏t✏t ⌘
6:
i:y
✏t =i:yiall
Update
6
=
i t t h ⇣
1wt,i ⇣ 1 ✏weights:
instance
2 t✏t ⌘

5:
5: Choose
Choose =
6=tht (xi2)1 ln 1 ✏
t = ln t
6: Update all instance
1 ⇣ 1 ✏✏weights:
t✏t ⌘
5:
6: Choose
Update i:yi 6=tht (xi1
all = 2
2)
instanceln t
5:
6: Choose
Update
w all
= w=instanceln
exp ⇣(1 ✏weights:
t✏t ⌘
weights: 8i = 1, . . . , n
t✏tt yi ht (xi ))
t
t+1,i all instance 2
t,i1 ✏
6:
5: Update
Choose = ln ⇣ 1 weights: ⌘
6: wt+1,i all
Update =t winstance 2 exp (1 ✏weights:
t,i1 t✏tt yi ht (xi )) 8i = 1, . . . , n
5:
6:
7:
Choose
w
Update
w t+1,i all
Normalize
w t+1,i
=
=
= w
w
=
winstance
t,i
t+1
ln
t wt,i2 exp ( ✏t t yi ht (x
exp
exp to (
( beweights:
tay
y h (x• i ))
))
distribution:
i h t (x i ))
β t   m
8i
8i
8i
easures  
=
=
=
1,
1,
1,
.. .. .. ,, n
. . . , n
n
the  importance  of  ht!
t+1,i all instance
6:
7: Update
Normalize
w t+1,i = wwt,it+1
t,i exp to( be t i t i
weights: ht (x• i ))If    8i
a distribution:
tayidistribution:  
✏    t  =
 
      
1,    
0.5  
.  .    .  ,,  nthen          t            0
               
7: Normalize w t+1 w to be
7: w
Normalize =w w exp to ( be tayidistribution:
h8i
t (x= 1,o . .8i
i )) = 1, . . . , n
t+1,i
P Trivial,   = 1, . .o . ,therwise   flip  ht‘s  predicIons  
7: wt+1,i
t+1,i = w
Normalize t,i
t+1 to
nwt+1,i be a distribution: .,n
7: wwt+1,i
Normalize = wP t+1 exp
t,ij=1 w(t+1,j tayidistribution:
h8i
t (x= i )) 8i n
t+1,i = w nw
t+1 to
t+1,i
w to be 1, . . . , n
7: w t+1,i =
Normalize
w = wP
P nwt+1,i
j=1
t+1 wt+1,j 8i
be a distribution:
8i • 
=
= β
1,
1, ..  .. grows  
.. ,, n n as  error  ht‘s  shrinks  
7: w t+1,i = P n t+1,i
wt+1,iw t+1,j 8i = 1, t
. .   . , n
8: endNormalize
for t+1,i
wt+1,i = Pt+1
w nj=1j=1 tow
w
be a distribution:
t+1,j
8i = 1, . . . , n
8: end for n wt+1,i
j=1
w
t+1,j
9:
8: Return
end the
wt+1,i = j=1
for hypothesis
P nwt+1,i t+1,j 8i = 1, . . . , n
8:
9: end
Return for the hypothesis
wt+1,i = Pnj=1 t+1,j 8i = w 1, . . . , n
8:
9: end
Return for the hypothesis w T
!
9:
8:
9: Return
end
Return for thethe hypothesis
j=1 X
hypothesis
t+1,j
!
8:
9: end for H(x)
Return the hypothesis = sign XT
T
h
t t (x) !
!
8:
9: end
Return for H(x)
the = sign X
hypothesis XT
t=1 t h t (x) ! 51  
T
INPUT: training data X, y = {(xi , yi )}ni=1 ,
INPUT: the training number dataofX, y = {(xiT, yi )}ni=1
iterations n , ⇥

AdaBoost
INPUT: training data X, y = {(x , y
i , yi )}n)} ,, 1 ⇤
1: INPUT:
Initialize
INPUT: the a training
thevector
training number ofdata
n of X,
uniform
dataofX, y =
iterations {(x
weights
iT
y = {(xiT, yi )}i=1 i w
i=1
n =
1 , ⇥n
i=1 , . . . , 1

number iterations   n
1:
2: Initialize
INPUT:
for t = 1, a. the vector
training
.
the . , Tnumber
number ofdata of
n uniform
of X, iterations
iterations {(xiT
y = weights T, yi )} n
wi=1
1 = ⇥ 1
, ⇥ n1 , . . . , n1 ⇤⇤ 1
1:
1: Initialize
INPUT:
Initialize aa training
vector
vector of n
  ofdata
n uniform
X,
uniform y = weights
{(xiT, yi )}
weights w
w
n1 =
= , ⇥ n11 ,, .. .. .. ,, n11 ⇤
2:
3:
1:
2:
for t
Train=
Initialize
INPUT:
for t =
1,
1,
.
model.
the . ,
a. training
. vector
. ,
T
T
number
h on
t ofdata X,of
n uniform y
X,
iterations
with instance
y = weights
{(xiT , yi )}
i=1
weights
1
wi=1
n1 =, ⇥ nn1 , w . . t. , nn1will  
⇤ be  ≤  1  
2:
3:
1:
4:
2:
3:
for
for t
Train=
Initialize
Compute
t
Train= 1,
1, .
model
model
the
.
the
.
. ,
a. .vector
the
, T
T
number
number
h
This  
hweighted onnX,
t of
on
iuniform
X,
s  yytiterations
of
of
he  
with same  
iterations
training
with
instance
weights
error
T
instance
as:   1 =⇢
weights
wrate
weights
⇥ n1h, w
of ..., ⇤
t: t n
w t. , 1 t
tet n ⇤ if h (x ) = y
1: Initialize a. .vector t of n uniform weights w = ⇥ , . .
3:
4:
2:
3: Train
forCompute
t
Train= 1,model
model X. the
, T h
h  
weightedon X, y with
training
t on X, y with instance weightsinstance
error weights
rate
1 of n
1 h w
w : 1 t i i
1:
4:
2:
4:
3:
Initialize
forCompute
t =
Compute
Train
✏=t =1, a. .vector
model X. the
,
theT hw
t of n uniform
weighted
weighted w
on X,t+1,i
training
training
y training
weights
=
with instance w
error
error t,irate ⇥
wrate
1 = ofnh, t. :. t. , n
rate
weightsof h w :
 
t: t t
4:
2:
3:
4:
forCompute
t
Train 1,
✏t =i:y
Compute
.
model.
i 6=
X
X
. the
,
the
T h
ht (xi ) w
t t,i
weighted
weighted
t on
t,i
X, y with
training
error
instance
error weights
rate
of
of
h
h
te
w
t: t
if ht (xi ) 6= yi
3: ✏✏t =
Train model X hw on X, y training with instance weights of hw
4: Compute hthe )   t,i
weighted error rate t: t will  be  ≥  1  
= t t,i
t i:yi 6= X t (xiw
4: ✏
Compute =
t i:yi 6=hthe w
t (xiweighted
) t,i ⇣ ⌘
training error rate of h t :
(xi2  ) ln
5: ✏t =i:y
Choose i 6=X
i:yi 6=tht=
ht (xi1w ) t,i ⇣ 1 ✏t ⌘
5: ✏t =i:yi 6=
Choose X = 1w t,i ⇣
ln 1 ✏t✏t ⌘
h (x ) ⇣ ⌘
6:
5:
5:
✏t =i:yiall
Update
Choose
Choose
t
=
6=tht (xi2)
=
i
t instance
1 EssenIally  
1wt,i ⇣ 1 ✏weights:
2 ln
ln 1 t✏t ⌘

✏t✏tt ⌘
this  emphasizes  misclassified  instances.  
6:
5: Update
Choose all t
i:yi 6=tht= instance
1
2) ln ⇣ 1 weights:
2  
(xi1 ✏t
6:
5:
6: Update
Choose
Update all
all =instance
instance ln ⇣ 1 ✏weights:
t✏t ⌘
weights:
6: w
Update = t w
t+1,i all instance 2
t,i1 exp ⇣ ( ✏ t✏tt yi ht (xi ))
1 weights: ⌘ 8i = 1, . . . , n
5:
6:
Choose
wt+1,i all
Update
=
=t winstance2  exp (1 ✏weights:
t,i1
ln t✏tt yi ht (xi )) 8i = 1, . . . , n
5: Choose
w t+1,i all = = ln
t wt,i2 exp ( ✏t t yi ht (xi )) 8i = 1, .. .. .. ,, n
6: Update
w = w instance exp ( weights: y h (x )) 8i = 1, n
7:
6:
7:
Normalize
w
Update
t+1,i
t+1,i
Normalize
=
all ww
w
t,i
t+1
t,i  
instance exp to
to
( be
weights:
be
t
t
a
a
y distribution:
i
i h t
t (x i
i ))
distribution:
8i = 1, . . . , n
7: w t+1,i
Normalize = ww
t+1 exp ( y
t,i   to be ta idistribution: h (x
t i )) 8i = 1, . . . , n
t+1 w
7: w
Normalize =w w exp to ( be tayidistribution:
h8i
t (x= 1, . .8i
i )) = 1, . . . , n
t+1,i
7: wt+1,i
t+1,i = w
Normalize P t,i
t+1
n w to be a distribution: .,n
7: wwt+1,i
Normalize = wP t+1
t,ij=1 exp w(t+1,j
t+1,i tayidistribution:
h8i
t (x=i )) 8i = 1, . . . , n
t+1,i = w nw
t+1 to
t+1,i
w to be 1, . . . , n
7: w t+1,i =
Normalize wP
P nwt+1,i wt+1,j 8i
be a distribution: = 1, .. .. .. ,, n
w
w t+1,i =
= P
j=1
t+1
n t+1,i
w t+1,j
8i
8i =
= 1,
1, . . . , n
n
7:
8: Normalize
end for t+1,i w j=1
n
t+1 w tow
t+1,i be
t+1,j a distribution:
wt+1,i = nj=1 P j=1
wt+1,iw t+1,j 8i = 1, . . . , n
8:
9: end
Return for the
wt+1,i = j=1 hypothesis
P w t+1,j 8i = 1, . . . , n
8: end for nwt+1,i
8:
9: end
Return for the hypothesis
wt+1,i = Pnj=1 t+1,j 8i = w 1, . . . , n
8:
9: end
Return for the hypothesis w T
!
9:
8:
9: Return
end
Return for thethe hypothesis
j=1 X
hypothesis
t+1,j
!
8:
9: end for H(x)
Return the hypothesis = sign X T
T
h
t t (x) !
!
8:
9: end
Return for H(x)
the = sign X
hypothesis X T
t=1 t h t (x) ! 52  
T
INPUT: training data X, y = {(xi , yi )}ni=1 ,
INPUT: the training number dataofX, y = {(xiT, yi )}ni=1
iterations n , ⇥

AdaBoost
INPUT: training data X, y = {(x , y
i , yi )}n)} ,, 1 ⇤
1: INPUT:
Initialize
INPUT: the a training
the vector
training number ofdata
n of X,
uniform
dataofX, y =
iterations {(x
weights
iT
y = {(xiT, yi )}i=1 i w
i=1
n =
1 , ⇥n
i=1 , . . . , 1

number iterations   n
1:
2: Initialize
INPUT:
for t = 1, a. the vector
training
.
the . , Tnumber
number ofdata of
n uniform
of X, iterations
iterations {(xiT
y = weights T, yi )} n
wi=1
1 = ⇥ 1
, ⇥ n1 , . . . , n1 ⇤⇤ 1
1:
1: Initialize
INPUT:
Initialize aa training
vector
vector of
of n
data
n uniform
X,
uniform y = weights
{(xiT, yi )}
weights w
w
n1 =
= , ⇥ n11 ,, .. .. .. ,, n11 ⇤
2:
3: for t
Train= 1, .
model.
the . , Tnumber
h on X,of y iterations
with instance i=1
weights
, ⇥ nn1 , w
1: Initialize a. training
vector t ofdata
n uniform 1 . . t. , nn1 ⇤
2: INPUT:
for t = 1, .
the . , Tnumber of X, y = weights
iterations {(xiT , yi )} wi=1
n1 =
2:
3:
1:
4:
2: for t
Train=
Initialize
forCompute
t = 1,
1, .
model. .
a. .vector
. ,
the
, T
T hweighted
t ofonnX, y training
uniform with weights
instance
error weights
wrate
1 = of ⇥ n1h, w ..., ⇤
t: t n
3:
1: Train
Initialize modelthe
a. .vector number
h t on
of n X,of y
uniform iterations
with weightsT
instance w weights
= ⇥ , w
. . t. , 1 ⇤
3:
4:
2:
3: Train
forCompute
t
Train= 1,model
model X. the
, T h
h on
weighted X, y with
training instance
t on X, y with instance weights error weights
rate
1 of n
1 wh w :
t t n
1:
4:
2: Initialize
Compute
forCompute
t = 1, a . . vector
. the
, T t of n
weighted uniform training weights
error w =
rate
1 of h, .
t .
: t. , 1
4:
3:
4: Train
✏=t = model X the hw weighted
on X, y training
t t,i with instance error rate
weightsof nhw t :: t n
2:
3: forCompute
t
Train 1, .
model. X. the
, T h weighted
on X, y training
with error
instance rate
weightsof h t
w
4: ✏t =i:y
Compute i 6= X the
ht (xi ) wweighted
t t,i training error rate of h t: t
3:
4: ✏✏t =
Train
Compute model X hw on X, y training with instance weights of hw
hthe weighted error rate t: t
= t t,i
t i:yi 6= X t (xiw ) t,i
4: ✏
Compute =
t i:yi 6=hthe w
t (xiweighted
) t,i ⇣ ⌘
training error rate of ht :
5: ✏
Choose t =i:yi 6=
i:yi 6=tht=
X ht (xi1w ) t,i ⇣ 1 ✏t ⌘
(xi2) ln
5: ✏
Choose t = X = (xi1w t,i ⇣
) ln 1 ✏t✏t ⌘
6:
i:y
✏t =i:yiall
Update
6
=
i t t h ⇣
1wt,i ⇣ 1 ✏weights:
instance
2 t✏t ⌘

5:
5: Choose
Choose =
6=tht (xi2)1 ln 1 ✏
t = ln t
6: Update all instance
1 ⇣ 1 ✏✏weights:
t✏t ⌘
5: Choose = 2 ln
i:yi 6=tht (xi1 2) Make  wt+1  sum  to  1  
t
6:
5:
6: Update
Choose
Update all
all =instance
instanceln ⇣(1 ✏weights:
t✏t ⌘
weights:
6: w t+1,i
Update all = t w 2
t,i exp
instance ✏ t y i h t (x i )) 8i = 1, . . . , n
5: Choose
wt+1,i all =t w = 1
ln ⇣(1 ✏weights:
2 exp
1 t✏ t⌘
t✏tt yi ht (xi )) 8i = 1, . . . , n  
6:
5: Update
Choose
w = instance
t,i1
=t,i2 exp weights:
ln ( ✏t t yi ht (xi )) 8i = 1, . . . , n
t+1,i all t w
6:
7: Update
w
Normalize
t+1,i = wwinstance
t,i exp to(( be weights:
tay h t (x i ))
idistribution: 8i = 1, .. .. .. ,, n
6: w
Update t+1,i =
all w t+1
t,i exp
instance t y
weights: i h t (x i )) 8i = 1, n
7: Normalize
w t+1,i = ww t+1
t,i exp to ( be ay distribution:
h (x ))
ta idistribution:
t i 8i = 1, . . . , n
7: Normalize w t+1 w to be
7: w
Normalize =w w exp to ( be tayidistribution:
h8i
t (x= 1, . .8i
i )) = 1, . . . , n
t+1,i
7: wt+1,i
t+1,i = w
Normalize P t,i
t+1
n w to be a distribution: .,n
7: wwt+1,i
Normalize = wP t+1 exp
t,ij=1 w(t+1,j
t+1,i tayidistribution:
h8i
t (x=i )) 8i = 1, . . . , n
t+1,i = w nw
t+1 to
t+1,i
w to be 1, . . . , n
7: w t+1,i =
Normalize wP
P nwt+1,i wt+1,j 8i
be a distribution: = 1, .. .. .. ,, n
w
w t+1,i =
= P
j=1
t+1
n t+1,i
w t+1,j
8i
8i =
= 1,
1, . . . , n
n
7:
8: Normalize
end for t+1,i w w
j=1
n
t+1 tow
t+1,i be
t+1,j a distribution:
wt+1,i = nj=1 P j=1
wt+1,iw t+1,j 8i = 1, . . . , n
8:
9: end
Return for the
wt+1,i = j=1 hypothesis
P w t+1,j 8i = 1, . . . , n
8: end for nwt+1,i
8:
9: end
Return for the hypothesis
wt+1,i = Pnj=1 t+1,j 8i = w 1, . . . , n
8:
9: end
Return for the hypothesis w T
!
9:
8:
9: Return
end
Return for thethe hypothesis
j=1 X
hypothesis
t+1,j
!
8:
9: end for H(x)
Return the hypothesis = sign XT
T
h
t t (x) !
!
8:
9: end
Return for H(x)
the = sign X
hypothesis XT
t=1 t h t (x) ! 53  
T
INPUT: training data X, y = {(xi , yi )}ni=1 ,
INPUT: the training number dataofX, y = {(xiT, yi )}ni=1
iterations n , ⇥

AdaBoost
INPUT: training data X, y = {(x , y
i , yi )}n)} ,, 1 ⇤
1: INPUT:
Initialize
INPUT: the a training
the vector
training number ofdata
n of X,
uniform
dataofX, y =
iterations {(x
weights
iT
y = {(xiT, yi )}i=1 i w
i=1
n =
1 , ⇥n
i=1 , . . . , 1

number iterations   n
1:
2: Initialize
INPUT:
for t = 1, a. the vector
training
.
the . , Tnumber
number ofdata of
n uniform
of X, iterations
iterations {(xiT
y = weights T, yi )} n
wi=1
1 = ⇥ 1
, ⇥ n1 , . . . , n1 ⇤⇤ 1
1:
1: Initialize
INPUT:
Initialize aa training
vector
vector of
of n
data
n uniform
X,
uniform y = weights
{(xiT, yi )}
weights w
w
n1 =
= , ⇥ n11 ,, .. .. .. ,, n11 ⇤
2:
3: for t
Train= 1, .
model.
the . , Tnumber
h on X,of y iterations
with instance i=1
weights
, ⇥ nn1 , w
1: Initialize a. training
vector t ofdata
n uniform 1 . . t. , nn1 ⇤
2: INPUT:
for t = 1, .
the . , Tnumber of X, y = weights
iterations {(xiT , yi )} wi=1
n1 =
2:
3:
1:
4:
2: for t
Train=
Initialize
forCompute
t = 1,
1, .
model. .
a. .vector
. ,
the
, T
T hweighted
t ofonnX, y training
uniform with weights
instance
error weights
wrate
1 = of ⇥ n1h, w ..., ⇤
t: t n
3:
1: Train
Initialize modelthe
a. .vector number
h t on
of n X,of y
uniform iterations
with weightsT
instance w weights
= ⇥ , w
. . t. , 1 ⇤
3:
4:
2:
3: Train
forCompute
t
Train= 1,model
model X. the
, T h
h on
weighted X, y with
training instance
t on X, y with instance weights error weights
rate
1 of n h
1 w w :
t t n
1:
4:
2: Initialize
Compute
forCompute
t = 1, a . . vector
. the
, T t of n
weighted uniform training weights
error w =
rate
1 of ,
h .
t .
: t. , 1
4:
3:
4: Train
✏=t = model X the hw weighted
on X, y training
t t,i with instance error rate
weightsof nhw t :: t n
2:
3: forCompute
t
Train 1, .
model. X. the
, T h weighted
on X, y training
with error
instance rate
weightsof h t
w
4: ✏t =i:y
Compute i 6= X the
ht (xi ) wweighted
t t,i training error rate of h t: t
3:
4: ✏✏t =
Train
Compute model X hw on X, y training with instance weights of hw
hthe weighted error rate t: t
= t t,i
t i:yi 6= X t (xiw ) t,i
4: ✏
Compute =
t i:yi 6=hthe w
t (xiweighted
) t,i ⇣ ⌘
training error rate of ht :
5: ✏
Choose t =i:yi 6=
i:yi 6=tht=
X ht (xi1w ) t,i ⇣ 1 ✏t ⌘
(xi2) ln
5: ✏
Choose t = X = (xi1w t,i ⇣
) ln 1 ✏t✏t ⌘
6:
i:y
✏t =i:yiall
Update
6
=
i t t h ⇣
1wt,i ⇣ 1 ✏weights:
instance
2 t✏t ⌘

5:
5: Choose
Choose =
6=tht (xi2)1 ln 1 ✏
t = ln t
6: Update all instance
1 ⇣ 1 ✏✏weights:
t✏t ⌘
5:
6: Choose
Update i:yi 6=tht (xi1
all = 2
2)
instanceln t
5:
6: Choose
Update
w all
= w=instanceln
exp ⇣(1 ✏weights:
t✏t ⌘
weights: 8i = 1, . . . , n
t✏tt yi ht (xi ))
t
t+1,i all instance 2
t,i1 ✏
6:
5: Update
Choose = ln ⇣ 1 weights: ⌘
6: wt+1,i all
Update =t winstance 2 exp (1 ✏weights:
t,i1 t✏tt yi ht (xi )) 8i = 1, . . . , n
5: Choose
w t+1,i all = = ln
t wt,i2 exp ( ✏t t yi ht (xi )) 8i = 1, .. .. .. ,, n
6:
7:
6:
7:
Update
w
Normalize
w
Update
t+1,i
t+1,i
Normalize
=
=
all
w
ww
w
instance
t,i
t+1
t,i
exp
exp
instance
to
to
(
( be
be
weights:
t
t
ay
y
weights:
a i
h
h t
(x
(x i
))
distribution:
i t i ))
distribution:
8i
8i =
= 1,
1, . Member  
. . , n
n classifiers  with  less  
w = w t+1 exp ( y h (x )) 8i = 1, . . . , n
= 1, . .error  
. , n are  given  more  weight  in  
7: t+1,i
Normalize w t,i to be ta idistribution:
t i
7: w
Normalize =w w t+1 w
exp to
t+1,i( be tayidistribution:
h8i
t (x=i )) 8i
7: wt+1,i
t+1,i = w
Normalize P t,i
t+1
n w to be a 1,
distribution: . . . , n
w = wP t+1 exp w(t+1,j tayidistribution:
h8i
t (x=i )) 8i = 1, . . . , n
7: wt+1,i
Normalize
w
t+1,i = w
= P
t,ij=1
nw
t+1 w
t+1,i
to
t+1,i
w
t+1,i
be
8i =
1,
1,
.
.
.
.
.
.
,
,
n
n
the  final  ensemble  hypothesis  
7: Normalize
w t+1,i
t+1,i =
wP n
j=1
t+1 to
wt+1,iw be a distribution:
t+1,j 8i = 1, .. .. .. ,, n  
7:
8: end forw
Normalize
t+1,i = wP n
w
j=1
n tow
t+1,i
t+1,j
be a 8i = 1,
distribution: n
Final  predicIon  is  a  weighted  
t+1
wt+1,i = Pj=1 n w
j=1 wt+1,j
t+1,j 8i = 1, . . . , n
8:
9: end
Return for
w the = hypothesis
P t+1,i
w t+1,j 8i = 1, . . . , n
8: end for t+1,i nwt+1,i
j=1
8:
9:
8:
9:
end
Return
end
Return
for
w
for the
t+1,i
the
= j=1 wt+1,j
hypothesis
P n
hypothesis
8i = ! 1, . . . , n combinaIon  of  each  
w
member’s  predicIon  
9:
8: Return
end for the hypothesis
j=1 X Tt+1,j
9: Return the hypothesis !
8:
9: end for H(x)
Return the hypothesis = sign XT
T t h t (x) !
!  
8:
9: end
Return for H(x)
the = sign X
hypothesis XT
t=1
T t h t (x) ! 54  
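The pseudocode above can be sketched directly in NumPy. This is a minimal illustration (not the slides' own code), assuming depth-1 decision stumps as the weak learner h_t and labels in {−1, +1}:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) decision stump that
    minimizes the weighted error under instance weights w."""
    n, d = X.shape
    best_err, best_stump = np.inf, None
    for j in range(d):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best_stump = err, (j, thresh, polarity)
    return best_stump

def stump_predict(stump, X):
    j, thresh, polarity = stump
    return np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)

def adaboost(X, y, T):
    n = len(y)
    w = np.full(n, 1.0 / n)                 # step 1: uniform weights
    ensemble = []
    for t in range(T):                      # step 2
        stump = train_stump(X, y, w)        # step 3: train h_t under w_t
        pred = stump_predict(stump, X)
        eps = w[pred != y].sum()            # step 4: weighted error rate
        eps = np.clip(eps, 1e-10, 1 - 1e-10)  # guard against log(0)
        beta = 0.5 * np.log((1 - eps) / eps)  # step 5
        w = w * np.exp(-beta * y * pred)      # step 6: emphasize mistakes
        w /= w.sum()                          # step 7: normalize
        ensemble.append((beta, stump))
    return ensemble

def predict(ensemble, X):
    # step 9: sign of the beta-weighted vote of the members
    scores = sum(beta * stump_predict(s, X) for beta, s in ensemble)
    return np.sign(scores)
```

For example, on the separable 1-D data X = [[1],[2],[3],[4]], y = [−1,−1,1,1], a few rounds of `adaboost` reproduce the labels exactly.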
Dynamic Behavior of AdaBoost
• If a point is repeatedly misclassified...
  – Each time, its weight is increased
  – Eventually it will be emphasized enough to generate a hypothesis that correctly predicts it

• Successive member hypotheses focus on the hardest parts of the instance space
  – Instances with the highest weight are often outliers

55
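Ignoring the normalization step, a point that is misclassified in every round has its weight multiplied by e^{β_t} each time, so the emphasis grows exponentially. A rough sketch, assuming (hypothetically) a constant weighted error of ε_t = 0.3 in every round:

```python
import math

# Hypothetical constant error rate, so beta_t is the same each round.
eps = 0.3
beta = 0.5 * math.log((1 - eps) / eps)

# Unnormalized weight factor of a point misclassified in rounds 1..5:
# step 6 with y_i * h_t(x_i) = -1 multiplies the weight by e^{beta}.
growth = [math.exp(t * beta) for t in range(1, 6)]
```

The factors grow geometrically, which is why such a point eventually dominates the weak learner's objective.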
AdaBoost and Overfitting
• VC theory originally predicted that AdaBoost would always overfit as T grew large
  – The hypothesis keeps growing more complex

• In practice, AdaBoost often did not overfit, contradicting VC theory

• Also, AdaBoost does not explicitly regularize the model

56
Explaining Why AdaBoost Works

[Figure: training and test error curves for AdaBoost on OCR data with C4.5 as the base learner]

• Empirically, boosting resists overfitting
• Note that it continues to drive down the test error even AFTER the training error reaches zero

[Figure from Schapire: "Explaining AdaBoost"]
57


Explaining Why AdaBoost Works

[Figure: margin distributions for AdaBoost on OCR data with C4.5 as the base learner, at T = 5, 100, and 1000]

• The "margins explanation" shows that boosting tries to increase the confidence in its predictions over time
  – Improves generalization performance
  – Effectively, boosting maximizes the margin!

[Figures from Schapire: "Explaining AdaBoost"]
58
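The margin of a training point can be computed from the ensemble's weighted vote. A minimal sketch (the normalization by Σ_t β_t, which puts margins in [−1, +1], follows the margins analysis; the function name is illustrative):

```python
import numpy as np

def margins(betas, preds, y):
    """Normalized margin y_i * sum_t(beta_t * h_t(x_i)) / sum_t(beta_t)
    for each training point. preds has shape (T, n) with entries +/-1.
    A positive margin means the weighted vote classifies the point
    correctly; larger margins mean higher confidence."""
    betas = np.asarray(betas, dtype=float)
    preds = np.asarray(preds, dtype=float)
    score = betas @ preds           # beta-weighted vote per point
    return y * score / betas.sum()
```

As T grows, AdaBoost tends to push these margins upward even after the training error has reached zero, which is the margins explanation for the test-error curves above.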


AdaBoost in Practice
Strengths:
• Fast and simple to program
• No parameters to tune (besides T)
• No assumptions on the weak learner

When boosting can fail:
• Given insufficient data
• Overly complex weak hypotheses
• Can be susceptible to noise
• When there are a large number of outliers

59
Boosted Decision Trees

• Boosted decision trees are one of the best "off-the-shelf" classifiers
  – i.e., they require no parameter tuning
• Limit member hypothesis complexity by limiting tree depth
• Gradient boosting methods are typically used with trees in practice

[Figure: error rates on 27 benchmark data sets]

"AdaBoost with trees is the best off-the-shelf classifier in the world" – Breiman, 1996
(Also, see results by Caruana & Niculescu-Mizil, ICML 2006)
[Figure from Freund and Schapire: "A Short Introduction to Boosting"]
60
