23 Ens RandomForests
Emilie Chautru
Geosciences and Geoengineering Department
Geostatistics team
Back to the beginning
[Figure: example decision trees for recognising fruit. Node tests: Colour (red / yellow), Size (< 5 cm), Shape (round / pointy, curved), each answered yes / no; leaves include Cherry and Strawberry.]
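[Figure: binary tree with yes / no tests such as x_2 < s_22, and the matching partition of the feature space into regions such as ℛ2, ℛ3, ℛ4, ℛ5, delimited by the thresholds s_11, s_12 and s_22.]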
$$\forall x \in \mathcal{X}, \quad g(x) = \sum_{r=1}^{R} \mathbf{1}_{x \in \mathcal{R}_r} \, y(\mathcal{R}_r)$$
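The decision function above is piecewise constant: exactly one indicator is non-zero for a given x, so the sum simply returns the label of the region containing x. A minimal sketch, assuming a one-dimensional feature space; the thresholds and labels in `REGIONS` are made up for illustration and do not come from the slides:

```python
# Hypothetical partition of a 1-D feature space into R = 3 regions.
# Each entry is (membership test for R_r, label y(R_r)).
REGIONS = [
    (lambda x: x < 2.0, 10.0),
    (lambda x: 2.0 <= x < 5.0, 20.0),
    (lambda x: x >= 5.0, 30.0),
]


def predict(x, regions=REGIONS):
    """g(x) = sum over r of 1[x in R_r] * y(R_r)."""
    # The regions are disjoint and cover the space, so one term survives.
    return sum(label for contains, label in regions if contains(x))
```

Calling `predict(3.0)` walks through the three indicators and keeps only the label of the middle region.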
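$$\mathcal{L}_{j,\mathcal{S}} = \{x \in \mathcal{X} \mid x_j \in \mathcal{S}\} \quad \text{and} \quad \mathcal{R}_{j,\mathcal{S}} = \{x \in \mathcal{X} \mid x_j \notin \mathcal{S}\}$$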
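$$C(j, \mathcal{S}) = \sum_{i=1}^{n} \left( y^i - y(\mathcal{L}_{j,\mathcal{S}}) \right)^2 \mathbf{1}_{x^i \in \mathcal{L}_{j,\mathcal{S}}} + \sum_{i=1}^{n} \left( y^i - y(\mathcal{R}_{j,\mathcal{S}}) \right)^2 \mathbf{1}_{x^i \in \mathcal{R}_{j,\mathcal{S}}}$$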
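The squared-error criterion above can be evaluated directly for a candidate split. The sketch below makes two assumptions that the slides do not state: the feature is continuous and the set S has the form {values below a threshold s}, and the region label y(·) is the mean label of the training points falling in that region. The function names are illustrative only.

```python
def split_cost(xs, ys, j, s):
    """C(j, S) with S = {values < s}: squared error around each side's mean."""
    left = [y for x, y in zip(xs, ys) if x[j] < s]
    right = [y for x, y in zip(xs, ys) if x[j] >= s]
    cost = 0.0
    for side in (left, right):
        if side:  # an empty side contributes nothing
            side_mean = sum(side) / len(side)
            cost += sum((y - side_mean) ** 2 for y in side)
    return cost


def best_split(xs, ys):
    """Exhaustive search over features j and midpoints between observed values."""
    best = None
    for j in range(len(xs[0])):
        values = sorted({x[j] for x in xs})
        for lo, hi in zip(values, values[1:]):
            s = (lo + hi) / 2
            cost = split_cost(xs, ys, j, s)
            if best is None or cost < best[0]:
                best = (cost, j, s)
    return best  # (cost, feature index, threshold)


# Toy data: the label depends only on feature 0.
xs = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0)]
ys = [1.0, 1.0, 5.0, 5.0]
print(best_split(xs, ys))
```

On this toy data the search selects feature 0 with a threshold halfway between 2 and 3, where both sides are perfectly homogeneous.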
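$$C(j, \mathcal{S}) = \frac{n(\mathcal{L}_{j,\mathcal{S}})}{n} \, I(\mathcal{L}_{j,\mathcal{S}}) + \frac{n(\mathcal{R}_{j,\mathcal{S}})}{n} \, I(\mathcal{R}_{j,\mathcal{S}})$$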
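For classification, the criterion above weights the impurity I of each side by the fraction of training points it receives. The slides do not say which impurity is used here; the sketch below assumes the Gini index as one common choice, and takes n(·) to be the number of training points in a region. The labels in the usage lines are toy values.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum over classes of (class proportion)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_impurity(left_labels, right_labels, impurity=gini):
    """C(j, S) = n(L)/n * I(L) + n(R)/n * I(R)."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * impurity(left_labels) + (
        len(right_labels) / n
    ) * impurity(right_labels)


# A split that separates the classes perfectly, and one that does not.
pure = weighted_impurity(["Adelie", "Adelie"], ["Gentoo", "Gentoo"])
mixed = weighted_impurity(["Adelie", "Gentoo"], ["Adelie", "Gentoo"])
```

Passing a different function as `impurity` (for instance an entropy) leaves the weighting unchanged, which is the point of writing the criterion in this form.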
[Figure: regression tree example. Panels plot the split criterion (mse) and Body mass (g); the horizontal axis ranges over roughly 180 to 220.]
[Figure: classification tree example. Panels plot Bill length (mm), coloured by Species (Adelie, Chinstrap, Gentoo).]
[Figure: fitted tree. Splits: flipper_length_mm < 207 versus >= 207, and bill_length_mm < 45 versus >= 45.]
Assets
handles features and labels of any type (continuous or categorical)
adapted to multi-class problems
adapted to multi-modal data
easy to implement, visualise and interpret
Liabilities
weak learners (too simple)
not robust to variations in the data (a small change in the training set can yield a very different tree)
Ensemble learning
Bagging: the errors of many independent weak learners can balance each other out
return the final decision function: $$g : x \in \mathcal{X} \mapsto \sum_{m=1}^{M} \alpha_m \, g_m(x)$$
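The weighted sum above can be illustrated with a small bagging sketch. Several choices here are assumptions rather than the slides' procedure: each weak learner g_m is a one-split regression tree (a stump) on a single feature, each is trained on a bootstrap sample of the training set, and the weights are uniform, alpha_m = 1/M. The names `fit_stump` and `bagging` are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """One-split regression tree on a 1-D feature, squared-error criterion."""
    overall = sum(ys) / len(ys)
    values = sorted(set(xs))
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(values, values[1:]):
        s = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < s]
        right = [y for x, y in zip(xs, ys) if x >= s]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if cost < best[0]:
            best = (cost, s, ml, mr)
    _, s, ml, mr = best
    if s is None:  # sample with a single distinct x value: constant predictor
        return lambda x: overall
    return lambda x: ml if x < s else mr


def bagging(xs, ys, M=25, seed=0):
    """Return g(x) = sum over m of alpha_m * g_m(x), with alpha_m = 1/M (assumed)."""
    rng = random.Random(seed)
    n = len(xs)
    learners = []
    for _ in range(M):
        # Bootstrap sample: n draws with replacement from the training set.
        idx = [rng.randrange(n) for _ in range(n)]
        learners.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    alpha = 1.0 / M
    return lambda x: sum(alpha * g_m(x) for g_m in learners)


# Toy step-shaped data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
g = bagging(xs, ys)
```

Each stump places its threshold slightly differently depending on its bootstrap sample, so the averaged function `g` changes more gradually around the step than any single stump does.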
Chloé-Agathe Azencott.
Introduction au Machine Learning.
Collection InfoSup, Dunod, 2022.
Trevor Hastie, Robert Tibshirani, Jerome Friedman.
The Elements of Statistical Learning.
Springer Series in Statistics, 2009.