
Workshop #15, AI

Santiago Varela Daza


svralea03@uan.edu.co

1. Use the ID3 algorithm to determine the best attribute for the root node of the decision
tree built from the training dataset shown in Table 1, where ‘Burn’ is the class attribute. For each
candidate attribute, show the mathematical procedure used to calculate the
entropy and the information gain obtained by choosing that attribute as the root node.

Hair     Height     Protection   Burn
blond    average    no           yes
blond    tall       yes          no
brown    short      yes          no
blond    short      no           yes
red      average    no           yes
brown    tall       no           no
brown    average    no           no
blond    short      yes          no

Table 1. Training data.

 
D(Hair) = {“blond”, “brown”, “red”} 
D(Height) = {“average”, “short”, “tall”} 
D(Protection) = {“yes”, “no”} 
D(Burn) = {“yes”, “no”} 
 
Burn: 
 
P(“yes”.burn|E) = 3 / 8 
P(“no”.burn|E) = 5 / 8 
 
I(Burn) = -3/8*log2(3/8)-5/8*log2(5/8) 
I(Burn) = 0.954434003 
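
The entropy above can be checked numerically. The following is a minimal Python sketch, not part of the original exercise; the helper name `entropy` is an assumption chosen for illustration. It takes class counts and skips zero counts, following the usual convention that 0*log2(0) = 0.

```python
from math import log2

def entropy(counts):
    """Shannon entropy, in bits, of a class distribution given as counts."""
    total = sum(counts)
    # Zero counts are skipped: 0*log2(0) is taken as 0 by convention.
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

# Burn: 3 "yes" and 5 "no" out of the 8 examples in Table 1.
i_burn = entropy([3, 5])
print(round(i_burn, 9))
```

Working from counts rather than probabilities keeps the fractions (3/8, 5/8) visible in the call, which mirrors the hand calculation.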
 
Protection: 
 
P(“yes”.burn | “yes”.protection) = 0/3
P(“no”.burn | “yes”.protection) = 3/3

P(“yes”.burn | “no”.protection) = 3/5
P(“no”.burn | “no”.protection) = 2/5
 
I(Burn | “yes”.protection) = -0/3*log2(0/3)-3/3*log2(3/3)
I(Burn | “yes”.protection) = 0   (taking 0*log2(0) = 0 by convention)
I(Burn | “no”.protection) = -3/5*log2(3/5)-2/5*log2(2/5) 
I(Burn | “no”.protection) = 0.970950594 
 
P(“yes”.protection | E) = 3 / 8 
P(“no”.protection | E) = 5 / 8 
 
 
I(Protection) = 3/8 * 0 + 5/8 * 0.970950594 
I(Protection) = 0.606844121 
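
The weighted (expected) entropy after splitting on Protection can be sketched the same way. This is an illustrative Python sketch under stated assumptions: `entropy` and `weighted_entropy` are hypothetical helper names, and each partition is given as [burn yes, burn no] counts read from Table 1.

```python
from math import log2

def entropy(counts):
    """Shannon entropy, in bits, of a class distribution given as counts."""
    total = sum(counts)
    # Zero counts are skipped: 0*log2(0) is taken as 0 by convention.
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def weighted_entropy(partitions):
    """Expected entropy after a split.

    partitions holds one list of class counts per attribute value.
    """
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * entropy(p) for p in partitions)

# Protection = "yes": 0 burned, 3 not burned.
# Protection = "no":  3 burned, 2 not burned.
i_protection = weighted_entropy([[0, 3], [3, 2]])
print(round(i_protection, 9))
```

The last printed digit may differ from the hand-computed value, because the hand calculation rounds the intermediate entropy before multiplying by 5/8.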
 
Height:
 
P(“yes”.burn | “average”) = 2/3  
P(“no”.burn | “average”) = 1/3 
 
P(“yes”.burn | “short”) = 1/3  
P(“no”.burn | “short”) = 2/3 
 
P(“yes”.burn | “tall”) = 0/2  
P(“no”.burn | “tall”) = 2/2 
 
I(Burn | “average”) = -2/3*log2(2/3)-1/3*log2(1/3) 
I(Burn | “average”) = 0.918295834 
 
I(Burn | “short”) = -1/3*log2(1/3)-2/3*log2(2/3) 
I(Burn | “short”) = 0.918295834  
 
I(Burn | “tall”) = -0/2*log2(0/2)-2/2*log2(2/2) 
I(Burn | “tall”) = 0 
 
P(“average” | E) = 3/8 
P(“short” | E) = 3/8 
P(“tall” | E) = 2/8 
 
I(Height) = 3/8*0.918295834 + 3/8*0.918295834 + 2/8*0
I(Height) = 0.688721875 
 
 
 
 
 
 
 
Hair:
 
P(“yes”.burn | “blond”) = 2/4
P(“no”.burn | “blond”) = 2/4

P(“yes”.burn | “brown”) = 0/3
P(“no”.burn | “brown”) = 3/3

P(“yes”.burn | “red”) = 1/1
P(“no”.burn | “red”) = 0/1
 
I(Burn | “blond”) = -2/4*log2(2/4)-2/4*log2(2/4) 
I(Burn | “blond”) = 1 
 
I(Burn | “brown”) = -0/3*log2(0/3)-3/3*log2(3/3) 
I(Burn | “brown”) = 0 
 
I(Burn | “red”) = -1/1*log2(1/1)-0/1*log2(0/1) 
I(Burn | “red”) = 0 
 
P(“blond” | E) = 4/8 
P(“brown” | E) = 3/8  
P(“red” | E) = 1/8 
 
I(Hair) = 4/8 * 1 + 3/8 * 0 + 1/8 * 0 
I(Hair) = 0.5 
 
Gain(Protection) = 0.954434003 - 0.606844121 = 0.347589882
Gain(Height) = 0.954434003 - 0.688721875 = 0.265712128
Gain(Hair) = 0.954434003 - 0.5 = 0.454434003
 
The best attribute for the root node is “Hair”, since it yields the highest information gain (0.454434003).
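
The whole root-node selection can be reproduced from Table 1 in a few lines. This is a minimal Python sketch of the ID3 attribute-selection step only, not a full tree builder; the names `ROWS`, `ATTRIBUTES`, `CLASS_INDEX`, `entropy`, and `information_gain` are assumptions introduced for illustration.

```python
from collections import Counter, defaultdict
from math import log2

# Table 1 as (hair, height, protection, burn) tuples.
ROWS = [
    ("blond", "average", "no", "yes"),
    ("blond", "tall", "yes", "no"),
    ("brown", "short", "yes", "no"),
    ("blond", "short", "no", "yes"),
    ("red", "average", "no", "yes"),
    ("brown", "tall", "no", "no"),
    ("brown", "average", "no", "no"),
    ("blond", "short", "yes", "no"),
]
ATTRIBUTES = {"Hair": 0, "Height": 1, "Protection": 2}
CLASS_INDEX = 3  # position of the class attribute "Burn"

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    # Counter only holds labels that occur, so no zero counts appear here.
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index):
    """Entropy of the class minus the weighted entropy after the split."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr_index]].append(row[CLASS_INDEX])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[CLASS_INDEX] for row in rows]) - remainder

gains = {name: information_gain(ROWS, idx) for name, idx in ATTRIBUTES.items()}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:.6f}")
print("Root attribute:", best)
```

Grouping the class labels by attribute value corresponds to the per-value conditional probabilities computed by hand, and the weighted sum corresponds to I(Protection), I(Height), and I(Hair).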
