Name: Syed Rehanuddin Quadri
Entropy:
$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} \, I(s_{1j}, \ldots, s_{mj})$$
Information gained:
$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$
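The entropy and information gain formulas above can be sketched in Python as follows. This is a minimal illustration, not the author's code: the function names `info`, `expected_entropy`, and `gain` are assumptions introduced here, `info` is assumed to be the usual Shannon entropy in bits (the formula for I is not spelled out in the text), and each subset is passed as a tuple of class counts.

```python
from math import log2

def info(counts):
    """I(s_1, ..., s_m): entropy in bits of a collection of class counts."""
    total = sum(counts)
    # Classes with zero samples contribute nothing (0 * log2(0) is taken as 0).
    return sum(-c / total * log2(c / total) for c in counts if c > 0)

def expected_entropy(partitions):
    """E(A): size-weighted average of info() over the subsets produced by A."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * info(p) for p in partitions)

def gain(class_counts, partitions):
    """Gain(A) = I(s_1, ..., s_m) - E(A)."""
    return info(class_counts) - expected_entropy(partitions)

# 6 males and 6 females overall give the maximum entropy for two classes.
print(round(info([6, 6]), 3))  # 1.0
```

A perfectly mixed two-class set has entropy 1, and a pure subset has entropy 0, which is why pure ranges in the tables contribute nothing to E(A).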
Height:
Range         Male  Female  Information
<5.6          1     2       0.918
5.6 to 5.11   3     2       0.971
>5.11         2     2       1
Total         6     6       1
Entropy E(H) = 0.967
Information Gain (H) = 1 - 0.967 = 0.033
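As a worked check of the Height figures, the steps can be written out as below. The expression used for I is the standard two-class Shannon entropy, which is an assumption here since the text does not state it explicitly; the counts are taken from the Height table.

```latex
\begin{align*}
I(1,2) &= -\tfrac{1}{3}\log_2\tfrac{1}{3} - \tfrac{2}{3}\log_2\tfrac{2}{3} \approx 0.918 \\
I(3,2) &= -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971 \\
I(2,2) &= 1 \\
E(\text{Height}) &= \tfrac{3}{12}(0.918) + \tfrac{5}{12}(0.971) + \tfrac{4}{12}(1) \approx 0.967 \\
\mathrm{Gain}(\text{Height}) &= I(6,6) - E(\text{Height}) = 1 - 0.967 = 0.033
\end{align*}
```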
Weight:
Range         Male  Female  Information
<160          0     4       0
160 to 179    2     0       0
>179          4     2       0.918
Total         6     6       1
Entropy E(W) = 0.459
Information Gain (W) = 1 - 0.459 = 0.541
Income:
Range         Male  Female  Information
<50k          3     2       0.971
50k to 80k    1     2       0.918
>80k          2     2       1
Total         6     6       1
Entropy E(I) = 0.967
Information Gain (I) = 1 - 0.967 = 0.033
Foot size:
Range         Male  Female  Information
<9            0     3       0
9 to 11       3     3       1
>11           3     0       0
Total         6     6       1
Entropy E(F) = 0.5
Information Gain (F) = 1 - 0.5 = 0.5
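The four gains can be recomputed from the (male, female) counts in the tables with a short script. This is a hedged sketch rather than the author's method: the helper names `info` and `gain` and the `tables` dictionary are assumptions introduced here, and `info` is taken to be Shannon entropy in bits.

```python
from math import log2

def info(counts):
    """Entropy in bits of a collection of class counts."""
    total = sum(counts)
    return sum(-c / total * log2(c / total) for c in counts if c > 0)

def gain(partitions):
    """Information gain of an attribute given per-range class counts."""
    # Overall class counts are the column sums of the per-range rows.
    class_counts = [sum(col) for col in zip(*partitions)]
    total = sum(class_counts)
    expected = sum(sum(p) / total * info(p) for p in partitions)
    return info(class_counts) - expected

# (male, female) counts per range, copied from the four tables.
tables = {
    "Height": [(1, 2), (3, 2), (2, 2)],
    "Weight": [(0, 4), (2, 0), (4, 2)],
    "Income": [(3, 2), (1, 2), (2, 2)],
    "Foot size": [(0, 3), (3, 3), (3, 0)],
}

gains = {name: round(gain(rows), 3) for name, rows in tables.items()}
best = max(gains, key=gains.get)
print(gains)  # {'Height': 0.033, 'Weight': 0.541, 'Income': 0.033, 'Foot size': 0.5}
print(best)   # Weight
```

Under the usual ID3 rule, the attribute with the largest gain is the one chosen for the first split; with these counts that is Weight, followed by Foot size.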
Decision tree:
A decision tree is a classification or regression model in the form of a tree structure. It breaks the dataset down into
progressively smaller subsets while the associated decision tree is developed incrementally. The final result is a tree with
decision nodes and leaf nodes.
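To make the terms "decision node" and "leaf node" concrete, here is a hypothetical one-level tree encoded as nested Python data. It is an illustration only and is not meant to reproduce the author's tree: the dictionary layout, the `classify` helper, and the choice of Weight as the single split are assumptions, and the `>179` leaf uses the majority class because that range is not pure in the Weight table.

```python
# A decision node is a dict naming an attribute with one child per branch;
# a leaf node is a plain string holding the predicted class.
tree = {
    "attribute": "weight",
    "branches": {
        "<160": "Female",      # 0 male, 4 female in the Weight table: pure leaf
        "160 to 179": "Male",  # 2 male, 0 female: pure leaf
        ">179": "Male",        # 4 male, 2 female: majority class (assumption)
    },
}

def classify(node, sample):
    """Follow branches from the root until a leaf (a string) is reached."""
    while isinstance(node, dict):
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node

# Samples are assumed to be already binned into the table's ranges.
print(classify(tree, {"weight": "<160"}))
print(classify(tree, {"weight": "160 to 179"}))
```

A fuller tree would replace the `>179` leaf with another decision node that splits on a second attribute.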
From the given data set we can get the following decision tree.
Income
    <50k
        Gender
            Male
                Weight
                Height
                Foot size
            Female
                Weight
                Height
                Foot size
    50k to 80k
        Gender
            Male
                Weight
                Height
                Foot size
            Female
                Weight
                Height
                Foot size
    >80k
        Gender
            Male
                Weight
                Height
                Foot size
            Female
                Weight
                Height
                Foot size