
Data Mining – Solved Example 01

Which feature will be at the root node of the decision tree trained for
the following data? Then draw the Decision Tree (DT) for this training
data.
Employed   Age          Have House   Decision (Give Credit)
No         Middle Age   No           No
Yes        Old          No           No
Yes        Middle Age   Yes          Yes
Yes        Old          Yes          No
No         Old          Yes          No
Yes        Young        Yes          Yes
Yes        Middle Age   No           No
No         Middle Age   Yes          Yes

Use your DT to classify the following:


Employed   Age     Have House   Decision (Give Credit)
No         Young   Yes          ?
No         Old     No           ?
Yes        Young   No           ?

Solution:
Step 1: Find the Entropy of the parent node (in this example it is Decision):
Class counts in the parent node: Yes = 3, No = 5
Entropy parent = -5/8 log2 (5/8) - 3/8 log2 (3/8) = 0.954

Note: To calculate log2(a/b), use either log10(a/b) / log10(2) or a two-argument logarithm with base 2, log(a/b, 2).
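As a quick check of Step 1, the parent entropy can be computed in a few lines of Python. This is a minimal sketch, not part of the original solution: the helper name `entropy` is an assumption made for illustration, and it uses the standard library's `math.log2` in place of the change-of-base formula from the note.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    # Empty classes are skipped: the term 0 * log2(0) is taken to be 0.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

# Parent node (Decision): Yes = 3, No = 5
parent_entropy = entropy([3, 5])
print(round(parent_entropy, 3))  # 0.954
```

The same helper returns 0 for a pure node (all records in one class) and 1 for a 50/50 split, which are the two special cases that appear in Step 2.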

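Step 2: Find the Information Gain of each candidate attribute:
*Information Gain = Entropy Parent – [Weighted average of the children Entropies],
where each child is weighted by the fraction of the parent's records it contains.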

1- Employed Attribute:
Class counts (Decision):
Parent: Yes = 3, No = 5
Employed = No: Yes = 1, No = 2
Employed = Yes: Yes = 2, No = 3

Entropy No = -1/3 log2 (1/3) - 2/3 log2 (2/3) = 0.918
Entropy Yes = -3/5 log2 (3/5) - 2/5 log2 (2/5) = 0.971
Information Gain Employed = 0.954 – [3/8 * 0.918 + 5/8 * 0.971] = 0.003
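The Employed calculation can be reproduced from the branch counts alone. The sketch below is illustrative only: `entropy` and `information_gain` are hypothetical helper names, and counts are assumed to be given as [Yes, No] pairs for the Decision class.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted average of the child entropies."""
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * entropy(child) for child in children_counts)
    return entropy(parent_counts) - weighted

# Parent: [3, 5]; Employed = No: [1, 2]; Employed = Yes: [2, 3]
gain_employed = information_gain([3, 5], [[1, 2], [2, 3]])
print(round(gain_employed, 3))  # 0.003
```

Each child is weighted by its share of the eight records (3/8 and 5/8 here), which is what the bracketed term in the hand calculation does.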

2- Age Attribute:

Class counts (Decision):
Parent: Yes = 3, No = 5
Age = Middle Age: Yes = 2, No = 2
Age = Old: No = 3
Age = Young: Yes = 1

Entropy Middle Age = -2/4 log2 (2/4) - 2/4 log2 (2/4) = 1
Entropy Old = -3/3 log2 (3/3) = 0
Entropy Young = -1/1 log2 (1/1) = 0
Information Gain Age = 0.954 – [4/8 * 1 + 3/8 * 0 + 1/8 * 0] = 0.454

3- Have House Attribute:

Class counts (Decision):
Parent: Yes = 3, No = 5
Have House = No: No = 3
Have House = Yes: Yes = 3, No = 2

Entropy No = -3/3 log2 (3/3) = 0
Entropy Yes = -3/5 log2 (3/5) - 2/5 log2 (2/5) = 0.971
Information Gain Have House = 0.954 – [3/8 * 0 + 5/8 * 0.971] = 0.347

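After calculating the Information Gain (IG) of each attribute, we select
Age as the root node since it has the highest IG (0.454). The Young and
Old branches are already pure (entropy 0), so only the Middle Age branch
needs a further split. Have House has the next-highest IG (0.347), and
within the four Middle Age records it separates the decisions perfectly
(Have House = Yes gives Yes, Have House = No gives No). The Employed
attribute, whose IG is negligible (0.003), is therefore not needed in
the tree.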
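The attribute ranking can also be checked directly from the training table. The following is a sketch under stated assumptions: the names `ROWS`, `ATTRIBUTES`, `entropy` and `information_gain` are made up for this illustration, and the second part recomputes the gains on the Middle Age records only, which is how the second-level split would normally be chosen. Because the code does not round intermediate values, Have House comes out at about 0.348 rather than the hand-computed 0.347.

```python
from collections import Counter
from math import log2

# Training records: (Employed, Age, Have House, Decision)
ROWS = [
    ("No",  "Middle Age", "No",  "No"),
    ("Yes", "Old",        "No",  "No"),
    ("Yes", "Middle Age", "Yes", "Yes"),
    ("Yes", "Old",        "Yes", "No"),
    ("No",  "Old",        "Yes", "No"),
    ("Yes", "Young",      "Yes", "Yes"),
    ("Yes", "Middle Age", "No",  "No"),
    ("No",  "Middle Age", "Yes", "Yes"),
]
ATTRIBUTES = {"Employed": 0, "Age": 1, "Have House": 2}

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the Decision column minus the weighted child entropies."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - weighted

# Gains on the full table decide the root node.
gains = {name: information_gain(ROWS, col) for name, col in ATTRIBUTES.items()}
root = max(gains, key=gains.get)
print(root, {name: round(g, 3) for name, g in gains.items()})

# Only the Middle Age branch is impure, so the gains are recomputed on it.
middle = [row for row in ROWS if row[1] == "Middle Age"]
gain_house_middle = information_gain(middle, ATTRIBUTES["Have House"])
gain_employed_middle = information_gain(middle, ATTRIBUTES["Employed"])
print(gain_house_middle, gain_employed_middle)
```

On the Middle Age subset, Have House removes all remaining uncertainty while Employed removes none, which supports placing Have House under Age and leaving Employed out.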

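Step 3: Drawing the decision tree:

Age
  Young: Give Credit = Yes
  Old: Give Credit = No
  Middle Age: Have House
    Yes: Give Credit = Yes
    No: Give Credit = No

Classifying the given records with this tree:

Employed   Age     Have House   Decision (Give Credit)
No         Young   Yes          Yes
No         Old     No           No
Yes        Young   No           Yes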
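The resulting tree is small enough to write as nested conditionals. This is a sketch, assuming the structure chosen in Step 2 (Age at the root, Have House under the Middle Age branch) with leaf labels read from the branch counts; the function name `classify` and the `queries` list are hypothetical.

```python
def classify(employed, age, have_house):
    """Return the Give Credit decision for one record."""
    if age == "Young":
        return "Yes"
    if age == "Old":
        return "No"
    # Middle Age: the decision follows Have House.
    # The employed argument is accepted but never consulted by this tree.
    return "Yes" if have_house == "Yes" else "No"

# The three records to classify: (Employed, Age, Have House)
queries = [("No", "Young", "Yes"), ("No", "Old", "No"), ("Yes", "Young", "No")]
for query in queries:
    print(query, "->", classify(*query))
```

Note that the third record is classified Yes even though Have House is No, because the Young branch is a leaf and Have House is only tested for Middle Age applicants.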
