Decision Tree


Introduction
- A decision tree is a graphical representation of specific decision situations, used when complex branching occurs in a structured decision process.
- Each branch of the decision tree represents a possible decision or option.
- A way to display an algorithm.
- Can be used as a visual aid to structure and solve sequential decision problems.
- Especially beneficial as the complexity of the problem grows.

Introduction
- Can be used to clarify and find an answer to a complex problem.
- Helps to determine a course of action or show a statistical probability; each branch of the decision tree represents a possible decision or occurrence.
- Allows users to take a problem with multiple possible solutions and display it in a simple, easy-to-understand format that shows the relationships between different events or decisions.
- Organized in a hierarchical fashion, starting with the root node on the far left and proceeding to subsequent decision nodes; all possible actions are listed in leaf nodes on the far right.
- Shows how one choice leads to the next; the use of branches indicates that the options are mutually exclusive.

Types of nodes
Three types of nodes:

- Decision nodes - represented by squares (□)
- Chance nodes - represented by circles (○)
- Terminal nodes - represented by triangles (△) (optional)

Decision points (nodes) are connected by arcs (one for each alternative of a decision) and terminate in ovals (the action that results from all of the decisions made along the path).

Example 1:
A bank has the following policy on deposits: on deposits of Rs 5000 and above held for 3 years or more, the interest rate is 12%; on the same deposit for a period of less than 3 years, it is 10%; on deposits below Rs 5000, the interest rate is 8% regardless of the period of deposit. Show the process using a decision tree.

The tree (reconstructed from the slide figure):

1. Deposit made
   - Deposit >= Rs 5000 → 2. Period duration
     - Period >= 3 years → Give 12% interest
     - Period < 3 years → Give 10% interest
   - Deposit < Rs 5000 → Give 8% interest

Legend: 1 = Deposit made, 2 = Period duration
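Read as code, this tree is just nested conditionals. A minimal sketch in Python (the function name and the sample deposits are illustrative, not part of the slide):

```python
def interest_rate(deposit_rs, years):
    """Interest rate (%) from the bank-policy tree above."""
    if deposit_rs >= 5000:      # node 1: deposit made
        if years >= 3:          # node 2: period duration
            return 12
        return 10
    return 8                    # below Rs 5000: 8% regardless of period

print(interest_rate(6000, 4))   # 12
print(interest_rate(6000, 2))   # 10
print(interest_rate(3000, 5))   # 8
```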

Example 2
Draw a decision tree from the following data.
Predictors: Outlook, Temperature, Humidity, Windy. Target: Play Golf.

Outlook    Temperature  Humidity  Windy  Play Golf
Rainy      Hot          High      FALSE  No
Rainy      Hot          High      TRUE   No
Overcast   Hot          High      FALSE  Yes
Sunny      Mild         High      FALSE  Yes
Sunny      Cool         Normal    FALSE  Yes
Sunny      Cool         Normal    TRUE   Yes
Overcast   Cool         Normal    TRUE   No
Rainy      Mild         High      FALSE  Yes
Rainy      Cool         Normal    FALSE  No
Sunny      Mild         Normal    FALSE  Yes
Rainy      Mild         Normal    TRUE   Yes
Overcast   Mild         High      TRUE   Yes
Overcast   Hot          Normal    FALSE  Yes
Sunny      Mild         High      TRUE   No
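For the worked steps on the later slides, the table can be transcribed as-is into Python (an illustrative encoding; the names COLUMNS and rows are ours, not the slide's):

```python
# The 14 rows of the table above, transcribed exactly as given.
COLUMNS = ["Outlook", "Temperature", "Humidity", "Windy", "Play Golf"]
rows = [
    ("Rainy",    "Hot",  "High",   False, "No"),
    ("Rainy",    "Hot",  "High",   True,  "No"),
    ("Overcast", "Hot",  "High",   False, "Yes"),
    ("Sunny",    "Mild", "High",   False, "Yes"),
    ("Sunny",    "Cool", "Normal", False, "Yes"),
    ("Sunny",    "Cool", "Normal", True,  "Yes"),
    ("Overcast", "Cool", "Normal", True,  "No"),
    ("Rainy",    "Mild", "High",   False, "Yes"),
    ("Rainy",    "Cool", "Normal", False, "No"),
    ("Sunny",    "Mild", "Normal", False, "Yes"),
    ("Rainy",    "Mild", "Normal", True,  "Yes"),
    ("Overcast", "Mild", "High",   True,  "Yes"),
    ("Overcast", "Hot",  "Normal", False, "Yes"),
    ("Sunny",    "Mild", "High",   True,  "No"),
]
```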

Algorithms
Many algorithms are available for constructing a decision tree; some of the notable ones are ID3, C4.5, CART, CHAID, and MARS.
The core algorithm for building decision trees, called ID3, is due to J. R. Quinlan; it employs a top-down, greedy search through the space of possible branches, with no backtracking. ID3 uses Entropy and Information Gain to construct a decision tree.

CONSTRUCTING A DECISION TREE
Two aspects:
- Which attribute to choose? → Entropy and Information Gain
- Where to stop? → Termination criteria

Entropy
Entropy is a measure of uncertainty in the data:

Entropy(S) = Σᵢ₌₁ᶜ −pᵢ log₂ pᵢ

where
S = the set of examples
c = size of the range of the target attribute (the number of classes)
pᵢ = the proportion of examples in S belonging to class i
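A direct translation of the formula into Python, checked against the Play Golf column of the table (a sketch; the helper name is ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = sum over the c classes of -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

# Play Golf column of the table above: 9 Yes, 5 No.
play = ["No", "No", "Yes", "Yes", "Yes", "Yes", "No",
        "Yes", "No", "Yes", "Yes", "Yes", "Yes", "No"]
print(round(entropy(play), 3))      # 0.94
```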

Information gain
The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).

Information gain = Entropy of the system before the split − Entropy of the system after the split

The attribute with the highest information gain is used as the root node.
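The same idea in code, reusing the entropy helper sketched above (an illustration, not the slide's own implementation):

```python
def information_gain(values, labels):
    """Entropy(S) minus the size-weighted entropy of each branch S_v."""
    total = len(labels)
    entropy_after = 0.0
    for v in set(values):                 # one branch per attribute value
        branch = [l for x, l in zip(values, labels) if x == v]
        entropy_after += len(branch) / total * entropy(branch)
    return entropy(labels) - entropy_after
```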

Step 1. Calculate the entropy of the target (before splitting).
For the table above the target has 9 Yes and 5 No, so
Entropy(Play Golf) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940.

Step 2
The dataset is then split on the different attributes. The entropy of each branch is calculated and then added proportionally to get the total entropy of the split. The resulting entropy is subtracted from the entropy before the split. The result is the information gain, or decrease in entropy.

Entropy after the split on attribute Outlook
The gains of the other predictors can be calculated in the same way (see the sketch below).
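Applying the helpers sketched above to the table as transcribed (reusing `rows`, `COLUMNS`, `entropy`, and `information_gain`; the number in the comment is computed from the data exactly as printed on the slide):

```python
labels = [r[-1] for r in rows]
for i, name in enumerate(COLUMNS[:-1]):
    gain = information_gain([r[i] for r in rows], labels)
    print(f"{name}: {gain:.3f}")
# On this table Outlook scores highest (about 0.104),
# so it is chosen as the root node in Step 3.
```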

Step 3
Choose the attribute with the highest gain as the root node.

Step 4
A branch with an entropy of 0 is a leaf node.
A branch with an entropy greater than 0 needs further splitting.

Step 5
The algorithm is run recursively on the non-leaf branches until all data is classified (see the sketch below).
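Steps 1-5 together amount to a short recursive procedure. A compact ID3-style sketch reusing `rows`, `COLUMNS`, and `information_gain` from the sketches above (an illustration, not Quinlan's original code):

```python
from collections import Counter

def id3(subset, attr_indices):
    labels = [r[-1] for r in subset]
    if len(set(labels)) == 1:        # entropy 0: pure branch -> leaf (Step 4)
        return labels[0]
    if not attr_indices:             # no attributes left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    # Greedy, no-backtracking choice: highest information gain (Step 3)
    best = max(attr_indices,
               key=lambda i: information_gain([r[i] for r in subset], labels))
    tree = {}
    for v in set(r[best] for r in subset):            # one subtree per value
        branch = [r for r in subset if r[best] == v]
        rest = [i for i in attr_indices if i != best]
        tree[(COLUMNS[best], v)] = id3(branch, rest)  # recurse (Step 5)
    return tree

print(id3(rows, list(range(len(COLUMNS) - 1))))
```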

