Decision Tree
Introduction
A decision tree is a graphical representation of specific decision situations that can be used to clarify and find an answer to a complex problem. It helps to determine a course of action or to show a statistical probability.
Types of nodes
A decision tree contains three types of nodes: decision nodes, chance nodes, and end nodes. Decision points (nodes) are connected by arcs (one for each alternative of a decision) and terminate in ovals (the action which is the result of all of the decisions made along the path).
Example 1:
A bank has the following policy on deposits: on deposits of Rs 5000 and above held for three years or more, the interest rate is 12%; on the same deposit held for less than three years, it is 10%. On deposits below Rs 5000, the interest rate is 8%, regardless of the period of deposit. Show the process using a decision tree.
[Decision tree for Example 1: node 1 branches on Deposit >= Rs 5000 vs Deposit < Rs 5000; node 2 branches on Period >= 3 years vs Period < 3 years. The Deposit < Rs 5000 branch leads directly to "Give 8% interest".]
Legends: 1 = Deposit Made, 2 = Period duration
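The policy in the diagram can be sketched as a small function (a minimal illustration of the tree above; the function name and signature are my own):

```python
def interest_rate(deposit, years):
    """Return the interest rate (%) for a deposit under the bank's policy."""
    if deposit >= 5000:      # node 1 (Deposit Made): deposit >= Rs 5000
        if years >= 3:       # node 2 (Period duration): period >= 3 years
            return 12
        return 10            # period < 3 years
    return 8                 # deposit < Rs 5000, regardless of period
```

For example, `interest_rate(6000, 4)` follows the upper path of the tree and returns 12.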
Example 2:
Predictors: Outlook, Temperature, Humidity, Windy. Target: Play Golf.

Outlook  | Temperature | Humidity | Windy | Play Golf
---------|-------------|----------|-------|----------
Rainy    | Hot         | High     | FALSE | No
Rainy    | Hot         | High     | TRUE  | No
Overcast | Hot         | High     | FALSE | Yes
Sunny    | Mild        | High     | FALSE | Yes
Sunny    | Cool        | Normal   | FALSE | Yes
Sunny    | Cool        | Normal   | TRUE  | Yes
Overcast | Cool        | Normal   | TRUE  | No
Rainy    | Mild        | High     | FALSE | Yes
Rainy    | Cool        | Normal   | FALSE | No
Sunny    | Mild        | Normal   | FALSE | Yes
Rainy    | Mild        | Normal   | TRUE  | Yes
Overcast | Mild        | High     | TRUE  | Yes
Overcast | Hot         | Normal   | FALSE | Yes
Sunny    | Mild        | High     | TRUE  | No
Algorithms
Many algorithms are available for constructing a decision tree. Notable among them are ID3, C4.5, CART, CHAID, and MARS.
The core algorithm for building decision trees, called ID3, was developed by J. R. Quinlan. It employs a top-down, greedy search through the space of possible branches, with no backtracking. ID3 uses Entropy and Information Gain to construct a decision tree.
Entropy
Entropy is a measure of uncertainty in the data:

    E(S) = Σ (i = 1..c) −pᵢ log₂ pᵢ

where
S = set of examples
c = size of the range of the target attribute (the number of classes)
pᵢ = proportion of examples in S belonging to class i
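This formula can be sketched as a short helper (a minimal illustration; the function name is my own):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels: sum over the c classes of -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())
```

On the Example 2 target column (9 Yes, 5 No), `entropy(["Yes"] * 9 + ["No"] * 5)` gives roughly 0.940, while a pure set such as `entropy(["Yes", "Yes"])` gives 0.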
Information gain
The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).
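The split criterion can be sketched as follows (a minimal illustration; function names are my own, and rows are represented as dicts of attribute -> value):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Gain = entropy before the split minus the weighted entropy
    of each branch after splitting on `attribute`."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(r[attribute] for r in rows):
        branch = [l for r, l in zip(rows, labels) if r[attribute] == value]
        gain -= (len(branch) / total) * entropy(branch)
    return gain
```

An attribute whose values separate the classes perfectly recovers all of the entropy before the split, so its gain equals that entropy; an attribute that tells us nothing has a gain of 0.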
Step 1: The attribute with the highest information gain is used as the root node.
Step 2: The dataset is then split on the different attributes, and the entropy of each branch is calculated.
Step 3: A branch with entropy of 0 is a leaf node.
Step 4: A branch with entropy greater than 0 needs further splitting.
Step 5: The algorithm is run recursively on the non-leaf branches, until all data is classified.
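The five steps can be sketched as one recursive procedure (a simplified ID3 illustration, not a full implementation; helper names are my own, and ties and empty branches are handled naively by majority vote):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, labels, attr):
    total = len(labels)
    g = entropy(labels)
    for value in set(r[attr] for r in rows):
        branch = [l for r, l in zip(rows, labels) if r[attr] == value]
        g -= (len(branch) / total) * entropy(branch)
    return g

def id3(rows, labels, attributes):
    # Step 3 stopping case: a pure branch (entropy 0) becomes a leaf;
    # if no attributes remain, fall back to the majority label
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Step 1: the attribute with the highest information gain becomes the node
    best = max(attributes, key=lambda a: gain(rows, labels, a))
    tree = {best: {}}
    # Step 2: split the dataset on the chosen attribute's values
    for value in set(r[best] for r in rows):
        sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in sub]
        sub_labels = [l for _, l in sub]
        # Steps 4-5: impure branches are split again, recursively
        tree[best][value] = id3(sub_rows, sub_labels,
                                [a for a in attributes if a != best])
    return tree
```

Running `id3` on the Example 2 rows with attributes `["Outlook", "Temperature", "Humidity", "Windy"]` returns a nested dict, where each key is an attribute, each sub-key one of its values, and each leaf a Play Golf label.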