Workshop # 15 AI
1. Use the ID3 algorithm to determine the best attribute for the root node of the decision
tree from the training dataset shown in Table 1, where “Burn” is the class attribute. For each
candidate attribute, show the mathematical procedure used to calculate the
entropy and the information gain obtained by choosing that attribute as the root node.
D(Hair) = {“blond”, “brown”, “red”}
D(Height) = {“average”, “short”, “tall”}
D(Protection) = {“yes”, “no”}
D(Burn) = {“yes”, “no”}
Burn:
P(“yes”.burn|E) = 3 / 8
P(“no”.burn|E) = 5 / 8
I(Burn) = -3/8*log2(3/8)-5/8*log2(5/8)
I(Burn) = 0.954434003
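The entropy calculation above can be checked with a short script. This is a minimal sketch, not part of the original solution: the function name `entropy` is an illustrative choice, and the class counts (3 “yes”, 5 “no”) are taken from the probabilities stated above.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    # Empty classes are skipped: 0 * log2(0) is treated as 0 by convention.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


# Class counts for Burn over the whole training set E: 3 "yes", 5 "no".
burn_entropy = entropy([3, 5])
print(f"I(Burn) = {burn_entropy:.9f}")
```

Working from counts rather than probabilities keeps the fractions exact until the final division and makes the zero-count case easy to handle.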
Protection:
P(“yes”.burn | “yes”.protection) = 0/3
P(“no”.burn | “yes”.protection) = 3/3
P(“yes”.burn | “no”.protection) = 3/5
P(“no”.burn | “no”.protection) = 2/5
I(Burn | “yes”.protection) = -0/3*log2(0/3)-3/3*log2(3/3)
I(Burn | “yes”.protection) = 0 (taking 0*log2(0) = 0 by convention)
I(Burn | “no”.protection) = -3/5*log2(3/5)-2/5*log2(2/5)
I(Burn | “no”.protection) = 0.970950594
P(“yes”.protection | E) = 3 / 8
P(“no”.protection | E) = 5 / 8
I(Protection) = 3/8 * 0 + 5/8 * 0.970950594
I(Protection) = 0.606844121
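The weighted average used for I(Protection) can be sketched as follows. The helper names `entropy` and `conditional_entropy` and the dictionary layout are assumptions made for illustration; the (yes, no) counts per Protection value come from the conditional probabilities listed above.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    # 0 * log2(0) is treated as 0, so empty classes are skipped.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


def conditional_entropy(partitions):
    """Entropy of the class after a split, weighted by subset size.

    `partitions` maps each attribute value to its (yes, no) class counts.
    """
    n = sum(sum(counts) for counts in partitions.values())
    return sum(sum(counts) / n * entropy(counts) for counts in partitions.values())


# (burn = yes, burn = no) counts for each value of Protection.
protection = {"yes": (0, 3), "no": (3, 2)}
protection_entropy = conditional_entropy(protection)
print(f"I(Protection) = {protection_entropy:.9f}")
```

The same function applies unchanged to Height and Hair by passing their per-value counts.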
Height:
P(“yes”.burn | “average”) = 2/3
P(“no”.burn | “average”) = 1/3
P(“yes”.burn | “short”) = 1/3
P(“no”.burn | “short”) = 2/3
P(“yes”.burn | “tall”) = 0/2
P(“no”.burn | “tall”) = 2/2
I(Burn | “average”) = -2/3*log2(2/3)-1/3*log2(1/3)
I(Burn | “average”) = 0.918295834
I(Burn | “short”) = -1/3*log2(1/3)-2/3*log2(2/3)
I(Burn | “short”) = 0.918295834
I(Burn | “tall”) = -0/2*log2(0/2)-2/2*log2(2/2)
I(Burn | “tall”) = 0
P(“average” | E) = 3/8
P(“short” | E) = 3/8
P(“tall” | E) = 2/8
I(Height) = 3/8 * 0.918295834 + 3/8 * 0.918295834 + 2/8 * 0
I(Height) = 0.688721875
Hair:
P(“yes”.burn | “blond”) = 2/4
P(“no”.burn | “blond”) = 2/4
P(“yes”.burn | “brown”) = 0/3
P(“no”.burn | “brown”) = 3/3
P(“yes”.burn | “red”) = 1/1
P(“no”.burn | “red”) = 0/1
I(Burn | “blond”) = -2/4*log2(2/4)-2/4*log2(2/4)
I(Burn | “blond”) = 1
I(Burn | “brown”) = -0/3*log2(0/3)-3/3*log2(3/3)
I(Burn | “brown”) = 0
I(Burn | “red”) = -1/1*log2(1/1)-0/1*log2(0/1)
I(Burn | “red”) = 0
P(“blond” | E) = 4/8
P(“brown” | E) = 3/8
P(“red” | E) = 1/8
I(Hair) = 4/8 * 1 + 3/8 * 0 + 1/8 * 0
I(Hair) = 0.5
Gain(Protection) = I(Burn) - I(Protection) = 0.954434003 - 0.606844121 = 0.347589882
Gain(Height) = I(Burn) - I(Height) = 0.954434003 - 0.688721875 = 0.265712128
Gain(Hair) = I(Burn) - I(Hair) = 0.954434003 - 0.5 = 0.454434003
The best attribute for the root node is “Hair”, because it yields the highest information gain.
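The whole root-node selection can be reproduced in a few lines. This is a sketch under stated assumptions: Table 1 is not reproduced here, so the script works from the per-value (yes, no) class counts implied by the conditional probabilities above, and the names `entropy`, `information_gain`, and `splits` are illustrative rather than part of the original solution.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    # 0 * log2(0) is treated as 0, so empty classes are skipped.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(class_counts, partitions):
    """Entropy of the class minus the size-weighted entropy after the split."""
    n = sum(class_counts)
    remainder = sum(sum(c) / n * entropy(c) for c in partitions.values())
    return entropy(class_counts) - remainder


# (burn = yes, burn = no) counts per attribute value, read off the text above.
splits = {
    "Hair": {"blond": (2, 2), "brown": (0, 3), "red": (1, 0)},
    "Height": {"average": (2, 1), "short": (1, 2), "tall": (0, 2)},
    "Protection": {"yes": (0, 3), "no": (3, 2)},
}
burn_counts = (3, 5)  # 3 "yes" and 5 "no" in the full training set

gains = {name: information_gain(burn_counts, parts) for name, parts in splits.items()}
for name, gain in gains.items():
    print(f"Gain({name}) = {gain:.9f}")

# ID3 picks the attribute with the largest gain as the root.
best = max(gains, key=gains.get)
print("Root attribute:", best)  # prints "Root attribute: Hair"
```

In a full ID3 run the same selection would be repeated recursively on each subset that is not yet pure, here only the “blond” branch.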