
Iterative Dichotomiser 3 (ID3) Algorithm

Medha Pradhan
CS 157B, Spring 2007
Agenda

 Basics of Decision Trees
 Introduction to ID3
 Entropy and Information Gain
 Two Examples
Basics
 What is a decision tree?
   A tree where each branching (decision) node represents a choice between two or more alternatives, and every branching node is part of a path to a leaf node.
 Decision node: specifies a test of some attribute.
 Leaf node: indicates the classification of an example.
ID3
 Invented by J. Ross Quinlan.
 Employs a top-down greedy search through the space of possible decision trees. The search is greedy because there is no backtracking: at each step it commits to the choice that currently looks best.
 At each node it selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest Information Gain.
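
A minimal sketch of this greedy, recursive procedure in Python (my own illustration, not code from the slides; the helper names and the row-as-dict representation are assumptions):

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels, log base 2.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target):
    # Expected reduction in entropy from splitting the rows on attr.
    base = entropy([r[target] for r in rows])
    for v in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == v]
        base -= len(subset) / len(rows) * entropy(subset)
    return base

def id3(rows, attributes, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # pure node -> leaf
        return labels[0]
    if not attributes:                         # nothing left to split on -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    # Greedy step, no backtracking: commit to the highest-gain attribute.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    return {best: {v: id3([r for r in rows if r[best] == v],
                          [a for a in attributes if a != best], target)
                   for v in set(r[best] for r in rows)}}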
Entropy
 Entropy measures the impurity of an arbitrary collection of examples.
 For a collection S whose examples fall into c classes, entropy is given as:

   Entropy(S) = Σi -pi log2 pi   (summing over the classes i = 1, ..., c, where pi is the proportion of examples in class i)

 For a collection S having only positive and negative examples:

   Entropy(S) = -p+ log2 p+ - p- log2 p-

   where p+ is the proportion of positive examples
   and p- is the proportion of negative examples.

 In general, Entropy(S) = 0 if all members of S belong to the same class.
 Entropy(S) = 1 (the maximum for two classes) when the members are split equally between the classes.
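
A small sketch of the two-class entropy formula in Python (illustration only, taking counts of positive and negative examples):

import math

def entropy(pos, neg):
    # Entropy(S) = -p+ log2 p+ - p- log2 p-, with 0*log2(0) taken as 0.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(3, 3))   # 1.0      -> equal split, maximum impurity
print(entropy(6, 0))   # 0.0      -> all members in the same class
print(entropy(4, 2))   # ~0.9183  -> the value used in Example 1 below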
Information Gain
 Measures the expected reduction in entropy caused by partitioning the examples on an attribute. The higher the information gain, the greater the expected reduction in entropy.

   Gain(S, A) = Entropy(S) - Σ v∈Values(A) (|Sv| / |S|) Entropy(Sv)

   where Values(A) is the set of all possible values for attribute A,
   and Sv is the subset of S for which attribute A has value v.
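
A direct transcription of this formula in Python (a sketch; the row-as-dict representation and helper names are my own):

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    # Gain(S, A) = Entropy(S) - sum over v of (|Sv| / |S|) * Entropy(Sv)
    gain = entropy([r[target] for r in rows])
    for value in set(r[attribute] for r in rows):
        sv = [r[target] for r in rows if r[attribute] == value]
        gain -= (len(sv) / len(rows)) * entropy(sv)
    return gain

With the animal table from Example 1 below, information_gain(rows, 'Feathers', 'Lays Eggs') comes out to about 0.459, matching the hand calculation that follows.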
Example 1
 Sample training data to determine whether an animal lays eggs.
 Warm-blooded, Feathers, Fur and Swims are the independent (condition) attributes; Lays Eggs is the dependent (decision) attribute.

 Animal      Warm-blooded   Feathers   Fur   Swims   Lays Eggs
 Ostrich     Yes            Yes        No    No      Yes
 Crocodile   No             No         No    Yes     Yes
 Raven       Yes            Yes        No    No      Yes
 Albatross   Yes            Yes        No    No      Yes
 Dolphin     Yes            No         No    Yes     No
 Koala       Yes            No         Yes   No      No


S contains 4 positive examples (Lays Eggs = Yes) and 2 negative examples.
Entropy(S) = Entropy(4Y,2N) = -(4/6)log2(4/6) – (2/6)log2(2/6) = 0.91829

Now we have to find the information gain for all four attributes:
Warm-blooded, Feathers, Fur, Swims.
For attribute ‘Warm-blooded’:
Values(Warm-blooded) : [Yes,No]
S = [4Y,2N]
SYes = [3Y,2N] E(SYes) = 0.97095
SNo = [1Y,0N] E(SNo) = 0 (all members belong to same class)
Gain(S,Warm-blooded) = 0.91829 – [(5/6)*0.97095 + (1/6)*0]
= 0.10916
For attribute ‘Feathers’:
Values(Feathers) : [Yes,No]
S = [4Y,2N]
SYes = [3Y,0N] E(SYes) = 0
SNo = [1Y,2N] E(SNo) = 0.91829
Gain(S,Feathers) = 0.91829 – [(3/6)*0 + (3/6)*0.91829]
= 0.45914
For attribute ‘Fur’:
Values(Fur) : [Yes,No]
S = [4Y,2N]
SYes = [0Y,1N] E(SYes) = 0
SNo = [4Y,1N] E(SNo) = 0.7219
Gain(S,Fur) = 0.91829 – [(1/6)*0 + (5/6)*0.7219]
= 0.3167
For attribute ‘Swims’:
Values(Swims) : [Yes,No]
S = [4Y,2N]
SYes = [1Y,1N] E(SYes) = 1 (equal members in both classes)
SNo = [3Y,1N] E(SNo) = 0.81127
Gain(S,Swims) = 0.91829 – [(2/6)*1 + (4/6)*0.81127] = 0.04411
Gain(S,Warm-blooded) = 0.10916
Gain(S,Feathers) = 0.45914
Gain(S,Fur) = 0.31670
Gain(S,Swims) = 0.04411

Gain(S,Feathers) is the maximum, so Feathers is chosen as the root node.
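
These four numbers are easy to reproduce in code. A rough check (my own sketch, reusing compressed versions of the helpers above):

import math
from collections import Counter

def H(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, attr):
    # Information gain of splitting on column index attr; the class label is the last column.
    return H([r[-1] for r in rows]) - sum(
        len(s) / len(rows) * H([r[-1] for r in s])
        for v in set(r[attr] for r in rows)
        for s in [[r for r in rows if r[attr] == v]])

# Columns: Warm-blooded, Feathers, Fur, Swims, Lays Eggs (one row per animal)
data = [('Y','Y','N','N','Y'),   # Ostrich
        ('N','N','N','Y','Y'),   # Crocodile
        ('Y','Y','N','N','Y'),   # Raven
        ('Y','Y','N','N','Y'),   # Albatross
        ('Y','N','N','Y','N'),   # Dolphin
        ('Y','N','Y','N','N')]   # Koala
for i, name in enumerate(['Warm-blooded', 'Feathers', 'Fur', 'Swims']):
    print(name, round(gain(data, i), 5))
# Prints 0.10917, 0.45915, 0.31669, 0.04411 -- the values above, up to rounding.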


The ‘Yes’ descendant contains only positive examples, so it becomes a leaf node with classification ‘Lays Eggs’.

                     Feathers
                  Y           N
   [Ostrich, Raven,        [Crocodile, Dolphin,
    Albatross]              Koala]
    Lays Eggs               ?

The remaining examples on the Feathers = No branch:

 Animal      Warm-blooded   Feathers   Fur   Swims   Lays Eggs
 Crocodile   No             No         No    Yes     Yes
 Dolphin     Yes            No         No    Yes     No
 Koala       Yes            No         Yes   No      No

We now repeat the procedure on this subset.

S = [Crocodile, Dolphin, Koala]
S = [1+,2-]

Entropy(S) = -(1/3)log2(1/3) – (2/3)log2(2/3) = 0.91829
 For attribute ‘Warm-blooded’:
Values(Warm-blooded) : [Yes,No]
S = [1Y,2N]
SYes = [0Y,2N] E(SYes) = 0
SNo = [1Y,0N] E(SNo) = 0
Gain(S,Warm-blooded) = 0.91829 – [(2/3)*0 + (1/3)*0] = 0.91829

 For attribute ‘Fur’:
Values(Fur) : [Yes,No]
S = [1Y,2N]
SYes = [0Y,1N] E(SYes) = 0
SNo = [1Y,1N] E(SNo) = 1
Gain(S,Fur) = 0.91829 – [(1/3)*0 + (2/3)*1] = 0.25162

 For attribute ‘Swims’:
Values(Swims) : [Yes,No]
S = [1Y,2N]
SYes = [1Y,1N] E(SYes) = 1
SNo = [0Y,1N] E(SNo) = 0
Gain(S,Swims) = 0.91829 – [(2/3)*1 + (1/3)*0] = 0.25162
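
The same kind of quick check works for this subset (sketch only, reusing the one-line entropy helper from before):

import math
from collections import Counter

def H(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# Columns: Warm-blooded, Fur, Swims, Lays Eggs (Crocodile, Dolphin, Koala)
subset = [('N','N','Y','Y'), ('Y','N','Y','N'), ('Y','Y','N','N')]
base = H([r[-1] for r in subset])                       # 0.91829...
for i, name in enumerate(['Warm-blooded', 'Fur', 'Swims']):
    rem = sum(len(s) / 3 * H([r[-1] for r in s])
              for v in {r[i] for r in subset}
              for s in [[r for r in subset if r[i] == v]])
    print(name, round(base - rem, 4))
# Prints 0.9183, 0.2516, 0.2516 -- Warm-blooded gives the largest gain.

Since Gain(S,Warm-blooded) is the largest, Warm-blooded becomes the decision node for the Feathers = No branch, which gives the tree below.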
The final decision tree will be:

                  Feathers
              Y             N
        Lays Eggs      Warm-blooded
                       Y          N
           Does not lay eggs    Lays Eggs
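
One possible way to encode this finished tree and use it to classify a new example (my own representation, not from the slides):

# Nested dicts: decision nodes map an attribute to its value branches; leaves are labels.
tree = {'Feathers': {'Yes': 'Lays Eggs',
                     'No': {'Warm-blooded': {'Yes': 'Does not lay eggs',
                                             'No': 'Lays Eggs'}}}}

def classify(example, node):
    while isinstance(node, dict):          # walk down until we hit a leaf label
        attribute = next(iter(node))
        node = node[attribute][example[attribute]]
    return node

# A hypothetical new animal (not in the training data):
penguin = {'Warm-blooded': 'Yes', 'Feathers': 'Yes', 'Fur': 'No', 'Swims': 'Yes'}
print(classify(penguin, tree))             # -> Lays Eggs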


Example 2
 Factors affecting sunburn

 Name    Hair     Height    Weight    Lotion   Sunburned
 Sarah   Blonde   Average   Light     No       Yes
 Dana    Blonde   Tall      Average   Yes      No
 Alex    Brown    Short     Average   Yes      No
 Annie   Blonde   Short     Average   No       Yes
 Emily   Red      Average   Heavy     No       Yes
 Pete    Brown    Tall      Heavy     No       No
 John    Brown    Average   Heavy     No       No
 Katie   Blonde   Short     Light     Yes      No
 S = [3+, 5-]
Entropy(S) = -(3/8)log2(3/8) – (5/8)log2(5/8)
= 0.95443

Find IG for all 4 attributes: Hair, Height, Weight, Lotion

 For attribute ‘Hair’:
Values(Hair) : [Blonde, Brown, Red]
S = [3+,5-]
SBlonde = [2+,2-] E(SBlonde) = 1
SBrown = [0+,3-] E(SBrown) = 0
SRed = [1+,0-] E(SRed) = 0
Gain(S,Hair) = 0.95443 – [(4/8)*1 + (3/8)*0 + (1/8)*0]
= 0.45443
 For attribute ‘Height’:
Values(Height) : [Average, Tall, Short]
SAverage = [2+,1-] E(SAverage) = 0.91829
STall = [0+,2-] E(STall) = 0
SShort = [1+,2-] E(SShort) = 0.91829
Gain(S,Height) = 0.95443 – [(3/8)*0.91829 + (2/8)*0 + (3/8)*0.91829]
= 0.26571
 For attribute ‘Weight’:
Values(Weight) : [Light, Average, Heavy]
SLight = [1+,1-] E(SLight) = 1
SAverage = [1+,2-] E(SAverage) = 0.91829
SHeavy = [1+,2-] E(SHeavy) = 0.91829
Gain(S,Weight) = 0.95443 – [(2/8)*1 + (3/8)*0.91829 + (3/8)*0.91829]
= 0.01571
 For attribute ‘Lotion’:
Values(Lotion) : [Yes, No]
SYes = [0+,3-] E(SYes) = 0
SNo = [3+,2-] E(SNo) = 0.97095
Gain(S,Lotion) = 0.95443 – [(3/8)*0 + (5/8)*0.97095] = 0.34759

Gain(S,Hair) = 0.45443
Gain(S,Height) = 0.26571
Gain(S,Weight) = 0.01571
Gain(S,Lotion) = 0.34759
Gain(S,Hair) is the maximum, so Hair is chosen as the root node.
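
Again, a rough check of these four gains in code (my own sketch, same one-line helpers as before):

import math
from collections import Counter

def H(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, attr):
    return H([r[-1] for r in rows]) - sum(
        len(s) / len(rows) * H([r[-1] for r in s])
        for v in {r[attr] for r in rows}
        for s in [[r for r in rows if r[attr] == v]])

# Columns: Hair, Height, Weight, Lotion, Sunburned
data = [('Blonde','Average','Light','No','Yes'), ('Blonde','Tall','Average','Yes','No'),
        ('Brown','Short','Average','Yes','No'),  ('Blonde','Short','Average','No','Yes'),
        ('Red','Average','Heavy','No','Yes'),    ('Brown','Tall','Heavy','No','No'),
        ('Brown','Average','Heavy','No','No'),   ('Blonde','Short','Light','Yes','No')]
for i, name in enumerate(['Hair', 'Height', 'Weight', 'Lotion']):
    print(name, round(gain(data, i), 4))
# Prints 0.4544, 0.2657, 0.0157, 0.3476 -- Hair wins.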
                         Hair
          Blonde          Red            Brown
   [Sarah, Dana,        [Emily]       [Alex, Pete, John]
    Annie, Katie]
        ?              Sunburned       Not Sunburned

The remaining examples on the Hair = Blonde branch:

 Name    Hair     Height    Weight    Lotion   Sunburned
 Sarah   Blonde   Average   Light     No       Yes
 Dana    Blonde   Tall      Average   Yes      No
 Annie   Blonde   Short     Average   No       Yes
 Katie   Blonde   Short     Light     Yes      No

Repeating the procedure on the Hair = Blonde subset:
S = [Sarah, Dana, Annie, Katie]
S = [2+,2-]
Entropy(S) = 1

Find the information gain for the remaining 3 attributes: Height, Weight, Lotion.


 For attribute ‘Height’:
Values(Height) : [Average, Tall, Short]
S = [2+,2-]
SAverage = [1+,0-] E(SAverage) = 0
STall = [0+,1-] E(STall) = 0
SShort = [1+,1-] E(SShort) = 1
Gain(S,Height) = 1 – [(1/4)*0 + (1/4)*0 + (2/4)*1]
= 0.5
 For attribute ‘Weight’:
Values(Weight) : [Average, Light]
S = [2+,2-]
SAverage = [1+,1-] E(SAverage) = 1
SLight = [1+,1-] E(SLight) = 1
Gain(S,Weight) = 1 – [(2/4)*1 + (2/4)*1]
=0

 For attribute ‘Lotion’:
Values(Lotion) : [Yes, No]
S = [2+,2-]
SYes = [0+,2-] E(SYes) = 0
SNo = [2+,0-] E(SNo) = 0
Gain(S,Lotion) = 1 – [(2/4)*0 + (2/4)*0]
=1
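
One more quick check of these three gains (sketch only; the base entropy of the Blonde subset is exactly 1):

import math
from collections import Counter

def H(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# Columns: Height, Weight, Lotion, Sunburned (Sarah, Dana, Annie, Katie)
blonde = [('Average','Light','No','Yes'), ('Tall','Average','Yes','No'),
          ('Short','Average','No','Yes'), ('Short','Light','Yes','No')]
for i, name in enumerate(['Height', 'Weight', 'Lotion']):
    rem = sum(len(s) / 4 * H([r[-1] for r in s])
              for v in {r[i] for r in blonde}
              for s in [[r for r in blonde if r[i] == v]])
    print(name, 1 - rem)
# Prints 0.5, 0.0, 1.0 -- Lotion separates the Blonde group perfectly.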
 In this case, the final decision tree will be:

                        Hair
         Blonde          Red           Brown
         Lotion       Sunburned     Not Sunburned
        Y      N
 Not Sunburned  Sunburned
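
As a closing check, a compact version of the ID3 sketch from earlier can be run end to end on the sunburn table; under those assumptions it reproduces exactly this tree:

import math
from collections import Counter

def H(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:
        return labels[0]
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    def g(a):   # information gain of attribute a on this subset
        return H(labels) - sum(len(s) / len(rows) * H([r[target] for r in s])
                               for v in {r[a] for r in rows}
                               for s in [[r for r in rows if r[a] == v]])
    best = max(attrs, key=g)
    return {best: {v: id3([r for r in rows if r[best] == v],
                          [a for a in attrs if a != best], target)
                   for v in {r[best] for r in rows}}}

cols = ['Hair', 'Height', 'Weight', 'Lotion', 'Sunburned']
rows = [dict(zip(cols, r)) for r in [
    ('Blonde','Average','Light','No','Yes'), ('Blonde','Tall','Average','Yes','No'),
    ('Brown','Short','Average','Yes','No'),  ('Blonde','Short','Average','No','Yes'),
    ('Red','Average','Heavy','No','Yes'),    ('Brown','Tall','Heavy','No','No'),
    ('Brown','Average','Heavy','No','No'),   ('Blonde','Short','Light','Yes','No')]]
print(id3(rows, ['Hair', 'Height', 'Weight', 'Lotion'], 'Sunburned'))
# -> {'Hair': {'Blonde': {'Lotion': {'No': 'Yes', 'Yes': 'No'}}, 'Brown': 'No', 'Red': 'Yes'}}
#    (branch order may vary; the 'Yes'/'No' leaves are the Sunburned labels)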
References
 "Machine Learning", Tom Mitchell, McGraw-Hill, 1997.
 "Building Decision Trees with the ID3 Algorithm", Andrew Colin, Dr. Dobb's Journal, June 1996.
 http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/dt_prob1.html
 Professor Sin-Min Lee, SJSU. http://cs.sjsu.edu/~lee/cs157b/cs157b.html
