Decision Tree


Introduction
- A decision tree is a graphical representation of specific decision situations, used when complex branching occurs in a structured decision process.
- Each branch of the decision tree represents a possible decision or option.
- A way to display an algorithm.
- Can be used as a visual aid to structure and solve sequential decision problems.
- Especially beneficial as the complexity of the problem grows.

Introduction
- Can be used to clarify and find an answer to a complex problem.
- Helps to determine a course of action or show a statistical probability; each branch of the decision tree represents a possible decision or occurrence.
- Allows users to take a problem with multiple possible solutions and display it in a simple, easy-to-understand format that shows the relationships between different events or decisions.
- Organized in a hierarchical fashion, starting with the root node on the far left and proceeding to subsequent decision nodes; all possible actions are listed in leaf nodes on the far right.
- Shows how one choice leads to the next; the use of branches indicates that the options are mutually exclusive.

Types of nodes
Three types of nodes:

- Decision nodes - represented by squares (□)
- Chance nodes - represented by circles (○)
- Terminal nodes - represented by triangles (△) (optional)

Decision points (nodes) are connected by arcs (one for each alternative of a decision) and terminate in ovals (the action that results from all of the decisions made along the path).

Example 1:
A bank has the following policy on deposits: on deposits of Rs 5000 and above held for 3 years or more, the interest rate is 12%; on the same deposit for a period of less than 3 years, it is 10%; on deposits below Rs 5000, the interest rate is 8% regardless of the period of deposit. Show the process using a decision tree.

The tree (reconstructed from the slide figure):

1. Deposit made
   - Deposit >= Rs 5000 → 2. Period duration
     - Period >= 3 years → Give 12% interest
     - Period < 3 years → Give 10% interest
   - Deposit < Rs 5000 → Give 8% interest

Legend: 1 = Deposit made, 2 = Period duration
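Read as code, this tree is just nested conditionals. A minimal sketch in Python (the function name and the sample deposits are illustrative, not part of the slide):

```python
def interest_rate(deposit_rs, years):
    """Interest rate (%) from the bank-policy tree above."""
    if deposit_rs >= 5000:      # node 1: deposit made
        if years >= 3:          # node 2: period duration
            return 12
        return 10
    return 8                    # below Rs 5000: 8% regardless of period

print(interest_rate(6000, 4))   # 12
print(interest_rate(6000, 2))   # 10
print(interest_rate(3000, 5))   # 8
```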

Example 2
Draw a decision tree from the following data.
Predictors: Outlook, Temperature, Humidity, Windy. Target: Play Golf.

Outlook    Temperature  Humidity  Windy  Play Golf
Rainy      Hot          High      FALSE  No
Rainy      Hot          High      TRUE   No
Overcast   Hot          High      FALSE  Yes
Sunny      Mild         High      FALSE  Yes
Sunny      Cool         Normal    FALSE  Yes
Sunny      Cool         Normal    TRUE   Yes
Overcast   Cool         Normal    TRUE   No
Rainy      Mild         High      FALSE  Yes
Rainy      Cool         Normal    FALSE  No
Sunny      Mild         Normal    FALSE  Yes
Rainy      Mild         Normal    TRUE   Yes
Overcast   Mild         High      TRUE   Yes
Overcast   Hot          Normal    FALSE  Yes
Sunny      Mild         High      TRUE   No
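For the worked steps on the later slides, the table can be transcribed as-is into Python (an illustrative encoding; the names COLUMNS and rows are ours, not the slide's):

```python
# The 14 rows of the table above, transcribed exactly as given.
COLUMNS = ["Outlook", "Temperature", "Humidity", "Windy", "Play Golf"]
rows = [
    ("Rainy",    "Hot",  "High",   False, "No"),
    ("Rainy",    "Hot",  "High",   True,  "No"),
    ("Overcast", "Hot",  "High",   False, "Yes"),
    ("Sunny",    "Mild", "High",   False, "Yes"),
    ("Sunny",    "Cool", "Normal", False, "Yes"),
    ("Sunny",    "Cool", "Normal", True,  "Yes"),
    ("Overcast", "Cool", "Normal", True,  "No"),
    ("Rainy",    "Mild", "High",   False, "Yes"),
    ("Rainy",    "Cool", "Normal", False, "No"),
    ("Sunny",    "Mild", "Normal", False, "Yes"),
    ("Rainy",    "Mild", "Normal", True,  "Yes"),
    ("Overcast", "Mild", "High",   True,  "Yes"),
    ("Overcast", "Hot",  "Normal", False, "Yes"),
    ("Sunny",    "Mild", "High",   True,  "No"),
]
```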

Algorithms
Many algorithms are available for constructing a decision tree; some of the notable ones are ID3, C4.5, CART, CHAID, and MARS.
The core algorithm for building decision trees, called ID3, is due to J. R. Quinlan; it employs a top-down, greedy search through the space of possible branches, with no backtracking. ID3 uses Entropy and Information Gain to construct a decision tree.

CONSTRUCTING A DECISION TREE
Two aspects:
- Which attribute to choose? → Entropy and Information Gain
- Where to stop? → Termination criteria

Entropy
Entropy is a measure of uncertainty in the data:

Entropy(S) = Σᵢ₌₁ᶜ −pᵢ log₂ pᵢ

where
S = the set of examples
c = size of the range of the target attribute (the number of classes)
pᵢ = the proportion of examples in S belonging to class i
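A direct translation of the formula into Python, checked against the Play Golf column of the table (a sketch; the helper name is ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = sum over the c classes of -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

# Play Golf column of the table above: 9 Yes, 5 No.
play = ["No", "No", "Yes", "Yes", "Yes", "Yes", "No",
        "Yes", "No", "Yes", "Yes", "Yes", "Yes", "No"]
print(round(entropy(play), 3))      # 0.94
```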

Information gain
The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).

Information gain = Entropy of the system before the split − Entropy of the system after the split

The attribute with the highest information gain is used as the root node.
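The same idea in code, reusing the entropy helper sketched above (an illustration, not the slide's own implementation):

```python
def information_gain(values, labels):
    """Entropy(S) minus the size-weighted entropy of each branch S_v."""
    total = len(labels)
    entropy_after = 0.0
    for v in set(values):                 # one branch per attribute value
        branch = [l for x, l in zip(values, labels) if x == v]
        entropy_after += len(branch) / total * entropy(branch)
    return entropy(labels) - entropy_after
```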

Step 1. Calculate the entropy of the target (before splitting).
For the table above the target has 9 Yes and 5 No, so
Entropy(Play Golf) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940.

Step 2
The dataset is then split on the different attributes. The entropy of each branch is calculated and then added proportionally to get the total entropy of the split. The resulting entropy is subtracted from the entropy before the split. The result is the information gain, or decrease in entropy.

Entropy after the split on attribute Outlook
The gains of the other predictors can be calculated in the same way (see the sketch below).
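Applying the helpers sketched above to the table as transcribed (reusing `rows`, `COLUMNS`, `entropy`, and `information_gain`; the number in the comment is computed from the data exactly as printed on the slide):

```python
labels = [r[-1] for r in rows]
for i, name in enumerate(COLUMNS[:-1]):
    gain = information_gain([r[i] for r in rows], labels)
    print(f"{name}: {gain:.3f}")
# On this table Outlook scores highest (about 0.104),
# so it is chosen as the root node in Step 3.
```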

Step 3
Choose the attribute with the highest gain as the root node.

Step 4
A branch with an entropy of 0 is a leaf node.
A branch with an entropy greater than 0 needs further splitting.

Step 5
The algorithm is run recursively on the non-leaf branches until all data is classified (see the sketch below).
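Steps 1-5 together amount to a short recursive procedure. A compact ID3-style sketch reusing `rows`, `COLUMNS`, and `information_gain` from the sketches above (an illustration, not Quinlan's original code):

```python
from collections import Counter

def id3(subset, attr_indices):
    labels = [r[-1] for r in subset]
    if len(set(labels)) == 1:        # entropy 0: pure branch -> leaf (Step 4)
        return labels[0]
    if not attr_indices:             # no attributes left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    # Greedy, no-backtracking choice: highest information gain (Step 3)
    best = max(attr_indices,
               key=lambda i: information_gain([r[i] for r in subset], labels))
    tree = {}
    for v in set(r[best] for r in subset):            # one subtree per value
        branch = [r for r in subset if r[best] == v]
        rest = [i for i in attr_indices if i != best]
        tree[(COLUMNS[best], v)] = id3(branch, rest)  # recurse (Step 5)
    return tree

print(id3(rows, list(range(len(COLUMNS) - 1))))
```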

