ML 15 Random Forests
$$E\big(F(Y)\big) = \sum_{\text{all } i} F(y_i)\,\Pr(Y = y_i)$$

or

$$E\big(F(Y)\big) = \int_{-\infty}^{+\infty} F(y)\,p(y)\,dy$$
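As a quick numerical illustration (the die-roll setup and F(y) = y² are my own example, not from the slides), the discrete sum can be checked against a Monte Carlo estimate:

```python
import numpy as np

# Discrete check: E[F(Y)] for a fair six-sided die with F(y) = y**2.
y_vals = np.arange(1, 7)
probs = np.full(6, 1 / 6)

# Exact sum: sum over all i of F(y_i) * Pr(Y = y_i)  ->  91/6.
exact = np.sum(y_vals.astype(float) ** 2 * probs)

# Monte Carlo estimate of the same expectation.
rng = np.random.default_rng(0)
draws = rng.choice(y_vals, size=200_000, p=probs)
estimate = np.mean(draws.astype(float) ** 2)

print(exact, estimate)
```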
Note that

$$\begin{aligned}
\operatorname{Var}(Y) &= E\big[(Y - \mu_Y)^2\big] \\
&= E\big[Y^2 - 2\mu_Y Y + \mu_Y^2\big] \\
&= E[Y^2] - 2E[\mu_Y Y] + E[\mu_Y^2] \\
&= E[Y^2] - 2\mu_Y E[Y] + \mu_Y^2\,E[1] \\
&= E[Y^2] - 2\mu_Y^2 + \mu_Y^2 \\
&= E[Y^2] - \mu_Y^2 \\
&= E[Y^2] - \big(E[Y]\big)^2
\end{aligned}$$

and that $\sigma_Y = \sqrt{\operatorname{Var}(Y)}$.
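The shortcut identity in the last line is easy to verify empirically (the normal sample below is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=3.0, size=100_000)

# Direct definition: E[(Y - mu_Y)^2], using the sample mean for mu_Y.
var_def = np.mean((y - y.mean()) ** 2)

# Shortcut identity derived above: E[Y^2] - (E[Y])^2.
var_identity = np.mean(y ** 2) - y.mean() ** 2

print(var_def, var_identity)  # both close to scale**2 = 9
```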
Remembering that $\mu_X = E[X]$,

$$\rho_{X,Y} = \operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[XY] - \mu_X \mu_Y}{\sigma_X \sigma_Y}$$
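A minimal numpy check of the moment form of the correlation, on synthetic data built to have correlation 0.8 (the construction is my own, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50_000)
# y shares a component with x; true correlation is 0.8 by construction.
y = 0.8 * x + np.sqrt(1 - 0.8 ** 2) * rng.normal(size=50_000)

# Moment formula: (E[XY] - mu_X * mu_Y) / (sigma_X * sigma_Y).
num = np.mean(x * y) - x.mean() * y.mean()
rho = num / (x.std() * y.std())

# Cross-check against numpy's built-in estimator.
rho_np = np.corrcoef(x, y)[0, 1]
print(rho, rho_np)
```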
Let $\hat{S} = \{\hat{S}_1, \hat{S}_2, \ldots, \hat{S}_K\}$ be a set of $K$ bootstrap samples.

Let

$$\hat{\sigma}_F^2 = \frac{1}{K}\sum_{k=1}^{K}\operatorname{Var}\big(F(\hat{S}_k)\big)$$

be the average variance of the individual predictors, and let

$$\hat{\rho}_F = \frac{\dfrac{1}{K(K-1)}\displaystyle\sum_{j=1}^{K}\sum_{\substack{k=1\\ k \neq j}}^{K}\operatorname{Cov}\big(F(\hat{S}_j), F(\hat{S}_k)\big)}{\dfrac{1}{K}\displaystyle\sum_{k=1}^{K}\operatorname{Var}\big(F(\hat{S}_k)\big)} = \frac{\dfrac{1}{K(K-1)}\displaystyle\sum_{j=1}^{K}\sum_{\substack{k=1\\ k \neq j}}^{K}\operatorname{Cov}\big(F(\hat{S}_j), F(\hat{S}_k)\big)}{\hat{\sigma}_F^2}$$

be the average pairwise correlation between them.
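The two quantities above can be estimated from a matrix of predictor outputs. A sketch under assumed conditions (the simulated predictors share a common signal so they are correlated; all names are illustrative):

```python
import numpy as np

# preds[k, t] is predictor k's output on draw t; K correlated predictors
# built from a shared component (true variance 1.0, true correlation 0.36).
rng = np.random.default_rng(3)
K, T = 10, 100_000
shared = rng.normal(size=T)
preds = 0.6 * shared + 0.8 * rng.normal(size=(K, T))

# sigma_hat_F^2: average variance of the individual predictors.
sigma2_hat = preds.var(axis=1).mean()

# rho_hat_F: average off-diagonal covariance divided by sigma_hat_F^2.
C = np.cov(preds)                     # K x K covariance matrix
off_diag_sum = C.sum() - np.trace(C)
rho_hat = off_diag_sum / (K * (K - 1)) / sigma2_hat

print(sigma2_hat, rho_hat)
```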
$$\begin{aligned}
\operatorname{Var}\!\left(\frac{1}{K}\sum_{i=1}^{K} F(\hat{S}_i)\right)
&= \frac{1}{K^2}\sum_{i=1}^{K}\sum_{j=1}^{K}\operatorname{Cov}\big(F(\hat{S}_i), F(\hat{S}_j)\big) \\
&= \frac{1}{K^2}\sum_{i=1}^{K}\left[\operatorname{Var}\big(F(\hat{S}_i)\big) + \sum_{j \neq i}\operatorname{Cov}\big(F(\hat{S}_i), F(\hat{S}_j)\big)\right] \\
&= \frac{1}{K^2}\sum_{i=1}^{K}\left[\hat{\sigma}^2 + (K-1)\,\hat{\sigma}^2\hat{\rho}\right] \\
&= \frac{1}{K}\left[\hat{\sigma}^2 + (K-1)\,\hat{\sigma}^2\hat{\rho}\right] \\
&= \frac{\hat{\sigma}^2}{K} + \frac{K-1}{K}\,\hat{\sigma}^2\hat{\rho}
\end{aligned}$$
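For an equicorrelated covariance matrix, the double-sum on the first line can be evaluated directly and compared with the closed form on the last line. The values of K, σ², and ρ below are arbitrary illustrative choices:

```python
import numpy as np

K, sigma2, rho = 25, 2.0, 0.3

# K x K covariance matrix: common variance sigma2, common correlation rho.
C = np.full((K, K), rho * sigma2)
np.fill_diagonal(C, sigma2)

# Var of the mean = (1/K^2) * sum of all covariance entries.
var_mean = C.sum() / K ** 2

# Closed form from the derivation: sigma^2/K + ((K-1)/K) * sigma^2 * rho.
closed = sigma2 / K + (K - 1) / K * sigma2 * rho

print(var_mean, closed)
```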
The City College of New York
CSc 59929 – Introduction to Machine Learning
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Variance of F(Ŝ)

$$\begin{aligned}
\operatorname{Var}\!\left(\frac{1}{K}\sum_{i=1}^{K} F(\hat{S}_i)\right)
&= \frac{\hat{\sigma}^2}{K} + \frac{K-1}{K}\,\hat{\sigma}^2\hat{\rho} \\
&= \frac{\hat{\sigma}^2}{K} + \hat{\sigma}^2\hat{\rho} - \frac{1}{K}\,\hat{\sigma}^2\hat{\rho} \\
&= \hat{\sigma}^2\hat{\rho} + \frac{1}{K}\,\hat{\sigma}^2(1 - \hat{\rho})
\end{aligned}$$

The first term decreases if $\hat{\rho}$ decreases; the second term decreases as $K$ increases, regardless of $\hat{\rho}$.
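A small sketch of this behavior, with assumed values σ² = 1 and ρ = 0.25: adding predictors drives the variance down only to the correlation floor σ²ρ, which is why random forests work to decorrelate their trees.

```python
# Ensemble variance sigma^2 * rho + (1/K) * sigma^2 * (1 - rho)
# falls toward the floor sigma^2 * rho as K grows (assumed values below).
sigma2, rho = 1.0, 0.25

def ensemble_var(K):
    return sigma2 * rho + sigma2 * (1 - rho) / K

for K in (1, 10, 100, 10_000):
    print(K, ensemble_var(K))  # approaches the floor 0.25
```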
[Plot: ensemble variance versus the number of predictors N, for N = 2 to 98; the vertical axis runs from 0.24 to 0.30, with the variance decreasing toward its floor as N grows.]
Out-of-bag sample advantages

The fact that any given data point $(x_i, y_i)$ is missing from a significant fraction (~1/3) of the bootstrap samples $\hat{S}_k$ drawn from even a small data set means that the predictors trained on those samples never saw $(x_i, y_i)$, so they can be used to estimate test performance without a separate validation set.

Let $\hat{F}_{\text{oob}}(x_i) = \operatorname{MajorityVoting}\big(\hat{F}_{\text{oob},1}(x_i),\, \hat{F}_{\text{oob},2}(x_i),\, \ldots,\, \hat{F}_{\text{oob},K}(x_i)\big)$,

where

$$\hat{F}_{\text{oob},k}(x_i) = \begin{cases} F_k(x_i) & \text{if } x_i \notin \hat{S}_k \\ \{\,\} & \text{if } x_i \in \hat{S}_k \end{cases}$$

(a predictor whose bootstrap sample contains $x_i$ abstains from the vote).

The out-of-bag error is given by

$$\operatorname{Error}_{\text{oob}} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\big[\hat{F}_{\text{oob}}(x_i) \neq y_i\big].$$
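The "~1/3" figure comes from the probability that a given point is absent from one bootstrap sample of size N, which is $(1 - 1/N)^N \to e^{-1} \approx 0.368$. A quick empirical check (the sample size and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
N, trials = 500, 2_000

missing = 0
for _ in range(trials):
    sample = rng.integers(0, N, size=N)    # bootstrap draw with replacement
    missing += N - np.unique(sample).size  # points omitted from this sample

fraction_omitted = missing / (N * trials)
print(fraction_omitted)  # close to 1/e ≈ 0.368
```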
Sample ID   1      2      3      4      …   N
Feature 1   F1(1)  F1(2)  F1(3)  F1(4)  …   F1(N)
Feature 2   F2(1)  F2(2)  F2(3)  F2(4)  …   F2(N)
⁞           ⁞      ⁞      ⁞      ⁞          ⁞
Feature M   FM(1)  FM(2)  FM(3)  FM(4)  …   FM(N)
Class       ⓪      ❶      ❶      ⓪      …   ❶
[Diagram: K bootstrap samples S1, …, SK are drawn from the training set with replacement; each sample repeats some points and omits others (the omitted data form its out-of-bag set), and a tree Tk is trained on each sample Sk.]
[Diagram: a new sample j, with features F1(j), …, FM(j) and unknown class, is fed to every tree; each tree Tk votes a class Ck (e.g. C1 = ❶, C2 = ⓪, C3 = ❶, C4 = ❶, …, CK = ⓪), and the forest outputs the majority vote C = ❶.]
[Diagram: running the full training set back through the forest yields a predicted class for each sample; comparing predicted and true classes gives a per-sample error indicator Error(i).]

$$\operatorname{Error}_{\text{training}} = \frac{1}{N}\sum_{i=1}^{N}\operatorname{Error}(i)$$
[Diagram: for each point $x_i$, only the trees whose bootstrap samples do not contain $x_i$ are used to obtain $\hat{F}_{\text{oob},k}(x_i)$; the out-of-bag votes are then combined by majority voting.]

$$\operatorname{Error}_{\text{oob}} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\big[\hat{F}_{\text{oob}}(x_i) \neq y_i\big]$$
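The whole out-of-bag procedure fits in a short numpy sketch. Everything here is an assumed illustration: the toy two-cluster data set is mine, and 1-nearest-neighbour within each bootstrap sample stands in for the decision trees of the slides.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy two-class data set: two Gaussian clusters (illustrative only).
N = 200
X = np.vstack([rng.normal(-1, 1, size=(N // 2, 2)),
               rng.normal(+1, 1, size=(N // 2, 2))])
y = np.array([0] * (N // 2) + [1] * (N // 2))

K = 50                        # number of bootstrap samples / base learners
votes = np.zeros((N, 2))      # out-of-bag votes per point, per class

for _ in range(K):
    idx = rng.integers(0, N, size=N)        # bootstrap sample S_k
    oob = np.setdiff1d(np.arange(N), idx)   # points x_i not in S_k

    # Base learner F_k: 1-nearest neighbour within the bootstrap sample.
    for i in oob:
        nearest = idx[np.argmin(np.linalg.norm(X[idx] - X[i], axis=1))]
        votes[i, y[nearest]] += 1           # only out-of-bag learners vote

# Majority vote over the learners that did not see each point.
voted = votes.sum(axis=1) > 0
pred = votes.argmax(axis=1)
oob_error = np.mean(pred[voted] != y[voted])
print(oob_error)
```

The resulting `oob_error` behaves like an estimate of test error, since every vote for $x_i$ comes from a learner that never trained on it.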