

STEP 1 : Represent the binary attributes by 0 and 1.

STEP 2 : Represent the ordinal attributes by their respective ranks, then normalize the ranks.

STEP 3 : To find the distances among all the objects (column-wise), first find the
dissimilarities for each individual attribute, then combine the results into the
overall distance :-

CALCULATING THE DISTANCES ACCORDING TO THE BINARY ATTRIBUTES :-

If the objects i and j are represented by symmetric binary attributes, then the dissimilarity between i
and j is represented by :

\[ d(i, j) = \frac{r + s}{q + r + s + t} \]

where,

q = no. of attributes that equal 1 for both objects,

r = no. of attributes that equal 1 for i and 0 for j,

s = no. of attributes that equal 0 for i and 1 for j,

t = no. of attributes that equal 0 for both

For example, for i = 2 and j = 4 the value of the project title is 1 for both objects. Hence, q = 1 and r = s = t = 0, so

\[ d(2, 4) = \frac{0 + 0}{1 + 0 + 0 + 0} = 0 \]

In a similar way, we can calculate the rest of the dissimilarity matrix :

j \ i   1  2  3  4  5  6  7  8  9  10
1       0
2       1  0
3       0  1  0
4       1  0  1  0
5       1  0  1  0  0
6       1  0  1  0  0  0
7       1  0  1  0  0  0  0
8       1  0  1  0  0  0  0  0
9       1  0  1  0  0  0  0  0  0
10      1  0  1  0  0  0  0  0  0  0

NOTE : Since the distance matrix is symmetric, I have filled in only the lower half of the matrix, so that it is easier to read.
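
The binary matrix above can be reproduced with a short sketch. The 0/1 codes in `flags` are an assumption inferred from the matrix itself (objects 1 and 3 differ from every other object, and the worked example gives the value 1 for objects 2 and 4); they are not taken from the original data, and the function name is only illustrative.

```python
def symmetric_binary_dissimilarity(a, b):
    """Return (r + s) / (q + r + s + t) for two equal-length 0/1 sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # 1 for both objects
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # 1 for i, 0 for j
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # 0 for i, 1 for j
    t = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)  # 0 for both objects
    return (r + s) / (q + r + s + t)


# Assumed 0/1 codes of the single binary attribute for objects 1..10
# (inferred from the matrix, not taken from the original data set).
flags = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1]

# Lower triangle of the dissimilarity matrix, one row per object.
binary_matrix = [
    [symmetric_binary_dissimilarity([flags[i]], [flags[j]]) for j in range(i + 1)]
    for i in range(len(flags))
]

for row_number, row in enumerate(binary_matrix, start=1):
    print(row_number, " ".join(f"{d:.0f}" for d in row))
```

With several binary attributes per object, the same function can be called with longer 0/1 sequences.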


CALCULATING THE DISTANCES AS PER ORDINAL ATTRIBUTES :-

In calculating the dissimilarity for the ordinal attributes, we first map the values of each attribute to ranks.

Then, we normalize the ranks.

Then, we find the dissimilarity between tuples i and j by calculating the Euclidean distance between their normalized ranks.

For example :

For i = 2, we have proj desc = 0.75 and for j = 4, we have proj desc = 0.25

Hence,

\[ d(2, 4) = \sqrt{(0.75 - 0.25)^2} = 0.5 \]

j \ i   1     2     3     4     5     6     7     8     9     10
1       0
2       0.75  0
3       1.00  0.25  0
4       0.25  0.50  0.75  0
5       0.75  0.00  0.25  0.50  0
6       0.75  0.00  0.25  0.50  0.00  0
7       0.75  0.00  0.25  0.50  0.00  0.00  0
8       0.25  0.50  0.75  0.00  0.50  0.50  0.50  0
9       0.50  0.25  0.50  0.25  0.25  0.25  0.25  0.25  0
10      0.50  0.25  0.50  0.25  0.25  0.25  0.25  0.25  0.00  0
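
As a minimal sketch of the rank-then-normalize step, the snippet below assumes the common normalization z = (rank - 1) / (M - 1), where M is the number of ranks. The five-level scale and the `ranks` list are assumptions chosen so that objects 2 and 4 get the normalized values 0.75 and 0.25 used in the example; they are inferred from the table, not taken from the original data.

```python
def normalize_rank(rank, num_ranks):
    """Map a rank in 1..num_ranks onto [0, 1] as (rank - 1) / (num_ranks - 1)."""
    return (rank - 1) / (num_ranks - 1)


def ordinal_distance(z_i, z_j):
    """Euclidean distance between two normalized ranks of a single attribute."""
    return ((z_i - z_j) ** 2) ** 0.5


# Assumed ranks of objects 1..10 on a five-level scale (inferred, hypothetical).
ranks = [1, 4, 5, 2, 4, 4, 4, 2, 3, 3]
z = [normalize_rank(r, 5) for r in ranks]

print(z[1], z[3])                    # normalized values of objects 2 and 4
print(ordinal_distance(z[1], z[3]))  # d(2, 4) = 0.5
```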
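
CALCULATING THE DISTANCE FOR THE NUMERIC ATTRIBUTE :

The distance between numeric attributes can be calculated as the Euclidean distance or the Manhattan
distance. In this case, since we have only one numeric attribute, both of these will be the same: each reduces to |x_i - x_j|.

The Euclidean distance between objects i and j is defined as :

\[ d(i, j) = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2} \]

The Manhattan distance between objects i and j is defined as :

\[ d(i, j) = \sum_{k=1}^{p} |x_{ik} - x_{jk}| \]

where p is the number of attributes and x_{ik} is the value of attribute k for object i.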

j \ i   1       2       3       4       5       6       7       8       9       10
1       0
2       0.8896  0
3       0.3961  0.4935  0
4       0.8896  0.0000  0.4935  0
5       0.7857  0.1039  0.3896  0.1039  0
6       0.8182  0.0714  0.4221  0.0714  0.0325  0
7       0.8701  0.0195  0.4740  0.0195  0.0844  0.0519  0
8       0.7857  0.1039  0.3896  0.1039  0.0000  0.0324  0.0844  0
9       1.0000  0.1104  0.6039  0.1104  0.2143  0.1818  0.1299  0.2143  0
10      0.8376  0.0519  0.4410  0.0519  0.0519  0.0195  0.0325  0.0519  0.1623  0
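
The claim that the two distances coincide for a single attribute can be checked with a small sketch. The raw values below are hypothetical, and min-max scaling to [0, 1] is an assumption; it is consistent with the largest entry of the matrix being 1.0000, but the normalization actually used is not stated.

```python
import math


def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def euclidean(x, y):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Manhattan distance between two equal-length attribute vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical raw values of one numeric attribute for four objects.
raw = [12.0, 40.0, 25.0, 40.0]
scaled = min_max_normalize(raw)

# With one attribute per object, both distances equal |x_i - x_j|.
for i in range(len(scaled)):
    for j in range(i):
        e = euclidean([scaled[i]], [scaled[j]])
        m = manhattan([scaled[i]], [scaled[j]])
        print(i + 1, j + 1, round(e, 4), round(m, 4))
```

With two or more attributes the two measures generally differ, for example 5 versus 7 for the vectors (0, 0) and (3, 4).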
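
CALCULATING THE DISTANCE FOR THE CONTINUOUS ATTRIBUTE :

The distance between continuous attributes can also be calculated as the Euclidean distance or the
Manhattan distance. In this case, since we have only one continuous attribute, both of these will be the
same: each reduces to |x_i - x_j|.

The Euclidean distance between objects i and j is defined as :

\[ d(i, j) = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2} \]

The Manhattan distance between objects i and j is defined as :

\[ d(i, j) = \sum_{k=1}^{p} |x_{ik} - x_{jk}| \]

where p is the number of attributes and x_{ik} is the value of attribute k for object i.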

j \ i   1         2         3         4         5         6         7         8         9         10
1       0
2       0.595833  0
3       0.154167  0.750000  0
4       0.604167  0.008333  0.758333  0
5       0.845833  0.250000  1.000000  0.241667  0
6       0.558333  0.037500  0.712500  0.045833  0.287500  0
7       0.608333  0.012500  0.762500  0.004167  0.237500  0.050000  0
8       0.554167  0.041667  0.708333  0.050000  0.291667  0.004167  0.054167  0
9       0.550000  0.045833  0.704167  0.054167  0.295833  0.008333  0.058333  0.004167  0
10      0.070833  0.525000  0.225000  0.533333  0.775000  0.487500  0.537500  0.483333  0.479167  0


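CALCULATION OF OVERALL DISTANCE BETWEEN THE OBJECTS, CONSIDERING ALL THE MIXED TYPE
ATTRIBUTES SHOWN ABOVE :-

For example : Let's take i = 3 and j = 9. Each of the four attributes is given weight 1, so the overall
distance between objects i and j is the average of the four dissimilarities found above (binary 1,
ordinal 0.50, numeric 0.6039, continuous 0.704167) :

\[ d(3, 9) = \frac{1(1) + 1(0.50) + 1(0.6039) + 1(0.704167)}{1 + 1 + 1 + 1} = 0.70201675 \]
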
In a similar way, the distance matrix can be calculated :-

j \ i   1           2           3           4           5           6           7           8           9           10
1       0
2       0.808861    0
3       0.387568    0.623377    0
4       0.685944    0.127083    0.75046     0
5       0.845387    0.088474    0.659903    0.211391    0
6       0.781629    0.0272321   0.596144    0.154315    0.0799919   0
7       0.807116    0.00799513  0.621632    0.130912    0.0804789   0.025487    0
8       0.64747     0.161391    0.711986    0.0384741   0.197917    0.134159    0.159646    0
9       0.7625      0.101556    0.702016    0.103639    0.19003     0.110038    0.109551    0.117113    0
10      0.602124    0.206737    0.54164     0.20882     0.269237    0.189245    0.204992    0.19632     0.160376    0
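
The combination step can be sketched as a weighted average of the per-attribute dissimilarities. The function name and the optional `indicators` argument are illustrative assumptions; in the worked example every attribute has weight 1, so the result is the plain mean of the four values.

```python
def mixed_distance(dissimilarities, indicators=None):
    """Weighted average of per-attribute dissimilarities.

    indicators holds one weight per attribute; by default every attribute
    gets weight 1, as in the worked example for i = 3 and j = 9.
    """
    if indicators is None:
        indicators = [1] * len(dissimilarities)
    total_weight = sum(indicators)
    if total_weight == 0:
        raise ValueError("at least one attribute must contribute")
    return sum(w * d for w, d in zip(indicators, dissimilarities)) / total_weight


# Binary, ordinal, numeric and continuous dissimilarities for objects 3 and 9.
d_3_9 = mixed_distance([1, 0.50, 0.6039, 0.704167])
print(round(d_3_9, 8))  # 0.70201675
```

Setting an indicator to 0 would leave that attribute out of the average, for instance when a value is missing for one of the two objects.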
