Main EL CM1 2 2023
Lecture 1:
Part 1: Introduction - Part 2: Decision Trees
Myriam Tami
myriam.tami@centralesupelec.fr
About me
Undergrad at the University of Montpellier (UM), France
Ph.D. in IMAG (Institut Montpelliérain Alexander Grothendieck) lab
at UM, France
Postdoc researcher in LIG (Laboratoire Informatique de Grenoble)
lab at University of Grenoble Alpes, France
Associate Professor (Maître de Conférences, MCF) at
CentraleSupélec & MICS (Mathematics Interacting with Computer
Science) lab since February 2019
Research interests: ML, Ensemble methods, Few-supervised ML
methods, Domain adaptation, Heterogeneous data, Multi-modal
learning, Causality.
Contact:
Email: myriam.tami@centralesupelec.fr
Office: Bouygues, MICS Lab, Room SC.116
M. Tami (CentraleSupélec) Ensemble learning from theory to practice January - March, 2023 2 / 72
Context Basic idea and reminders Decision tree model and building Discussion References
Course organization
Teaching tools
Edunao:
I More details about the course and the organization (timeline, etc.)
I Teaching material (slides, Jupyter notebooks, PDFs with practical
exercises, corrections)
I Resources (books, papers)
I Quiz sessions
I Submit your work
Microsoft Teams:
I To join the team, use the following code: bnxzwpa
I If you are sick (covid or other), please let me know, and we will use
Teams
The guidelines:
Take inspiration from the previous project or any data challenge
from the web
Try to propose a project, i.e., an ML task on a reasonably sized
dataset: a huge dataset may be too time-consuming to work with
The project guidelines stay the same (apply all approaches of the
course, compare and present the results and performances,
conclude, write a report, provide the Git link of your code...)
Prerequisites in ML
Given an ML problem,
Decide what type of ML problem it can be formalized as:
I Supervised? Unsupervised?
I Regression? Classification? Clustering?
I Dimensionality reduction?
Describe it formally in terms of
I A design matrix X of dimension (n, d)
I A target variable/matrix Y
I A loss function to optimize
Apply the ML pipeline (data preprocessing, model selection,
evaluation, make prediction, etc.)
Programming (Python, scikit-learn, numpy, pandas, scipy, etc.)
Deal with real-world data challenges
Traditional ML process:
You have a data set (both inputs and targets)
You have to achieve a task (e.g., classification, regression)
To tackle it, you learn a predictor (a predictive model)!
Figure: source: Hands-On Machine Learning with Scikit-Learn and TensorFlow, A. Géron
Python library:
scikit-learn
(sklearn module)
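A minimal sketch of this process with scikit-learn; the dataset, the choice of a decision tree estimator, and all hyperparameter values are illustrative, not prescribed by the course:

```python
# Learn a predictor for a classification task with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)          # input data and targets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)                  # learn the predictive model
accuracy = clf.score(X_test, y_test)       # evaluate on unseen data
```

The same fit/score pattern applies to regression by swapping in a regressor and a regression metric.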
Related goals:
Construct a predictor f̂(x) of a function f mapping x to y:

    f : X → Y
        x ↦ y

for supervised regression: Y = R and X = R^d
for supervised classification: Y = {1, . . . , C} and X = R^d,
where C is the number of classes and d the dimension of the
input space X
The data
Related problem:
For a new observation x_new, we want to predict the associated
unobserved y
Supervised learning
Plan
1 Context & motivations
Machine learning field and weak learners
Decision Tree: a supervised weak learner
2 Basic idea and reminders
Example of basic decision tree
Vocabulary and conventions
3 Decision tree model and building
Decision tree model
Building regression trees
Building classification trees
4 Discussion
1 First, the instance traverses the tree to find its leaf node
2 The tree returns the ratio of training instances of each class c in this node
E.g., x_i = (0.45, 0.05): the corresponding leaf node is the depth-3 left
node (R1)
So the DT outputs the probabilities: 0 for class 1, 5/6 ≈ 0.83 for
the majority class 2, and 1/6 ≈ 0.17 for class 3
Naturally, the DT predicts class 2 for x_i
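In scikit-learn, these per-class leaf ratios are returned by predict_proba. A small sketch on the iris petal features; the tree, the depth, and the query point are illustrative and do not reproduce the exact numbers of the slide:

```python
# Class-probability prediction: a leaf returns the ratio of training
# instances of each class it contains; predict() takes the argmax.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data[:, 2:], iris.target          # petal length and width only
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

x_new = np.array([[5.0, 1.5]])                # an arbitrary new observation
proba = clf.predict_proba(x_new)[0]           # per-class ratios in its leaf
pred = clf.predict(x_new)[0]                  # the majority class of the leaf
```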
As you can see DTs are intuitive and their decisions are easy to
interpret!
Such models are often called white box models
In contrast, as we will see in a later lecture, Random Forests are
generally considered black box models
Black box models make great predictions, but it is usually hard to
explain in simple terms why the predictions were made
E.g., if a neural network says that a particular person appears on a picture, it is hard to
know what actually contributed to this prediction: did the model recognize that person’s
eyes? Her nose?
Conversely, DTs provide nice and simple prediction rules that can
even be applied manually if need be
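The white-box nature can be made concrete: scikit-learn can print a fitted tree's decision rules as plain text, so they could indeed be applied by hand. The dataset and depth below are illustrative:

```python
# Print the learned decision rules of a tree in human-readable form.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Each line is a threshold test on one feature; leaves show the class.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```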
Exercise 2
Rewrite the f̂(x) formula using the regions R1, R2, R3, R4, R5, R6.
Exercise 3
Rewrite the f̂(x) formula using a sum symbol from k = 1 to k = K = 6.
Elements of intuition:
The set of K hyper-rectangles Rk is determined by a set of K − 1
chosen splits and their order during the recursive binary
segmentation
Each split is chosen by a greedy strategy to best discriminate
the observations between its two branches
The fitted value γk assigned to each terminal node Rk is usually
decided by majority vote (classification task) or by averaging
(regression task)
Many decision tree models are possible: as many as there are
input space partitions and sets of γk (hence the instability)
We seek the best decision tree predictor f̂(x) = Σ_{k=1}^{K} γ̂k 1{x ∈ R̂k}
In practice, this is achieved by minimizing a given risk function or
impurity measure (the main optimization problem),
which evaluates the error made by the learned predictor
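The predictor is just a piecewise-constant function. A minimal one-dimensional sketch, with hypothetical regions and fitted values chosen only for illustration:

```python
# f_hat(x) = sum_k gamma_k * 1{x in R_k}: piecewise-constant prediction
# over K = 3 hypothetical intervals of the real line.
import numpy as np

boundaries = np.array([0.3, 0.7])     # splits defining R_1, R_2, R_3
gammas = np.array([1.0, 2.5, 0.5])    # fitted value gamma_k on each region

def f_hat(x):
    # searchsorted returns the index k of the interval containing x
    return gammas[np.searchsorted(boundaries, x)]
```

E.g., `f_hat(0.1)` falls in R_1 and returns 1.0, while `f_hat(0.9)` falls in R_3 and returns 0.5.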
Building a regression DT
The minimization of the risk function allows the construction of a
decision tree by determining the best values for the sets of Rk and
γk
Given a learning set D = (x_i, y_i)_{1 ≤ i ≤ n} ⊂ X × Y and using the
empirical quadratic risk function, one should solve:
∀k ∈ {1, . . . , K},

    γ̂k = argmin_{γk} (1/n) Σ_{i=1}^{n} (y_i − γk 1{x_i ∈ Rk})^2    (disjoint regions)

∀k ∈ {1, . . . , K}, given x_i ∈ Rk,

    γ̂k = argmin_{γk} (1/n) Σ_{x_i ∈ Rk} (y_i − γk)^2

Then, ∀k ∈ {1, . . . , K},

    (∂/∂γk) Rn(f) = 0 ⇔ −2 Σ_{x_i ∈ Rk} (y_i − γk) = 0
                      ⇔ γ̂k = (1/nk) Σ_{x_i ∈ Rk} y_i,

the average of the y_i whose x_i fall in Rk, where nk is the number
of such points.
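This closed-form solution, the region-wise average, can be checked numerically. The two regions, the synthetic data, and the perturbation size below are illustrative:

```python
# gamma_hat_k is the mean of the y_i whose x_i fall in region R_k:
# since the regions are disjoint, the quadratic risk decomposes over
# regions and is minimized region by region.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100)
y = np.where(x < 0.5, 1.0, 3.0) + rng.normal(0.0, 0.1, size=100)

in_r1 = x < 0.5                  # two regions: R_1 = [0, 0.5), R_2 = [0.5, 1]
gamma_1 = y[in_r1].mean()        # fitted value on R_1
gamma_2 = y[~in_r1].mean()       # fitted value on R_2

def risk(g1, g2):
    # empirical quadratic risk of the piecewise-constant predictor
    pred = np.where(in_r1, g1, g2)
    return ((y - pred) ** 2).mean()

# moving either gamma away from the region mean can only increase the risk
assert risk(gamma_1, gamma_2) <= risk(gamma_1 + 0.1, gamma_2)
assert risk(gamma_1, gamma_2) <= risk(gamma_1, gamma_2 - 0.1)
```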
where, for any choice of (j, s), the inner minimization is solved by
γ̂1 = average (yi |xi ∈ R1 (j, s)) and γ̂2 = average (yi |xi ∈ R2 (j, s))
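One step of this greedy search can be sketched by brute force over candidate (j, s) pairs; the function name, the synthetic data, and the true split location are illustrative:

```python
# Greedy search for the best split (j, s): minimize the summed squared
# errors of the two children, whose fitted values are the child means.
import numpy as np

def best_split(X, y):
    n, d = X.shape
    best = (None, None, np.inf)            # (feature j, threshold s, sse)
    for j in range(d):
        for s in np.unique(X[:, j])[:-1]:  # candidate thresholds
            left, right = y[X[:, j] <= s], y[X[:, j] > s]
            sse = ((left - left.mean()) ** 2).sum() + \
                  ((right - right.mean()) ** 2).sum()
            if sse < best[2]:
                best = (j, s, sse)
    return best

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.where(X[:, 0] < 0.4, 0.0, 1.0)      # true split on feature 0 at 0.4
j, s, sse = best_split(X, y)
```

Real implementations (e.g., CART in scikit-learn) do the same thing with sorted features and incremental statistics instead of this O(n·d·n) brute force.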
Penalized criterion
The optimal tree is a pruned subtree fT minimizing the prediction
error penalized by the complexity of the model, i.e., the size of the
tree |T| (its number of terminal nodes):

    Rα(fT) = Rn(fT) + α |T| / n

Rn(fT) is the error term linked to the decision subtree fT (e.g.,
mean squared error or the misclassification rate)
The complexity parameter α ≥ 0 penalizes large trees
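In scikit-learn this penalized criterion corresponds to minimal cost-complexity pruning, exposed through the ccp_alpha parameter (note that sklearn's criterion is not normalized by n; the dataset is illustrative):

```python
# Cost-complexity pruning: a larger penalty alpha yields a smaller
# pruned subtree; the largest alpha prunes down to the root alone.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
full = DecisionTreeClassifier(random_state=0).fit(X, y)
path = full.cost_complexity_pruning_path(X, y)   # effective alphas

leaves = [
    DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X, y).get_n_leaves()
    for a in path.ccp_alphas
]
# the number of leaves shrinks as the penalty alpha grows
assert all(a >= b for a, b in zip(leaves, leaves[1:]))
```

In practice, the best alpha is chosen by cross-validation over `path.ccp_alphas`.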
Plan
1 Context & motivations
Machine learning field and weak learners
Decision Tree: a supervised weak learner
2 Basic idea and reminders
Example of basic decision tree
Vocabulary and conventions
3 Decision tree model and building
Decision tree model
Building regression trees
Building classification trees
4 Discussion
Building a classification DT
Let's write Dk for the data points in node k:
Dk = {(x_i, y_i) ∈ D : x_i ∈ Rk}
We consider risk functions adapted to classification
To split nodes, we choose the split yielding the maximal
impurity reduction, solving the optimization problem
Several measures Q(Dk) of node impurity can be considered
I Misclassification error:

    Q(Dk) = (1/nk) Σ_{x_i ∈ Rk} 1{y_i ≠ c(k)} = 1 − p̂_{k,c(k)}    (3)

I Gini index:

    Q(Dk) = Σ_{c ≠ c'} p̂_{k,c} p̂_{k,c'} = Σ_{c=1}^{C} p̂_{k,c} (1 − p̂_{k,c})    (4)

I Cross-entropy:

    Q(Dk) = − Σ_{c=1, p̂_{k,c} ≠ 0}^{C} p̂_{k,c} ln(p̂_{k,c})    (5)
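The three measures (3)-(5) can be computed directly from the class proportions p̂_{k,c} of a node; the example proportions below are arbitrary:

```python
# Node impurity measures computed from the class proportions of a node.
import numpy as np

def misclassification(p):
    return 1.0 - p.max()              # 1 - p_hat_{k, c(k)}

def gini(p):
    return np.sum(p * (1.0 - p))      # sum_c p_c (1 - p_c)

def cross_entropy(p):
    p = p[p > 0]                      # skip the p_hat = 0 terms
    return -np.sum(p * np.log(p))

p_pure = np.array([1.0, 0.0, 0.0])          # pure node: impurity is 0
p_mixed = np.array([1 / 3, 1 / 3, 1 / 3])   # maximally mixed node
```

A pure node has zero impurity under all three measures; the uniform node maximizes them (Gini 2/3 and entropy ln 3 for C = 3).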
Most of the time, the choice does not make a big difference and
leads to similar trees
Gini impurity is slightly faster to compute ⇒ it is a good default
However, when they differ, Gini impurity tends to isolate the most
frequent class in its own branch of the tree, while entropy tends to
produce slightly more balanced trees
References I
Géron, A. (2017).
Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.
O'Reilly Media.
Zhou, Y. (2019).
Asymptotics and Interpretability of Decision Trees and Decision Tree Ensembles.
PhD thesis, Cornell University.