d(i, j) = (r + s) / (q + r + s + t)
STEP 2 : Represent the ordinal attributes by their respective ranks, then normalize the ranks.
STEP 3 : To find the distances among all the objects (column-wise), we first compute the
dissimilarities for each individual attribute and then combine the results to obtain the
overall distance:
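If the objects i and j are described by symmetric binary attributes, then the dissimilarity between i
and j is given by:
$$d(i, j) = \frac{r + s}{q + r + s + t}$$
where q is the number of attributes equal to 1 for both objects, r is the number equal to 1 for i but 0
for j, s is the number equal to 0 for i but 1 for j, and t is the number equal to 0 for both objects.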
For example, for i=2 and j=4 the value of the project title is 1 for both objects. Hence q=1 and r=s=t=0, so
$$d(2, 4) = \frac{0 + 0}{1 + 0 + 0 + 0} = 0$$
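The counting behind this formula can be sketched in Python. This is a minimal illustration, not the original author's code: the function name is hypothetical, and the one-element vectors assume a single binary attribute (the project title) with value 1 for objects 2 and 4 and value 0 for object 1, which is what the matrix entries imply.

```python
def symmetric_binary_dissimilarity(x, y):
    """Dissimilarity between two equal-length 0/1 vectors: (r + s) / (q + r + s + t)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("objects must have the same non-zero number of attributes")
    pairs = list(zip(x, y))
    q = sum(1 for a, b in pairs if a == 1 and b == 1)  # 1 for both objects
    r = sum(1 for a, b in pairs if a == 1 and b == 0)  # 1 for i, 0 for j
    s = sum(1 for a, b in pairs if a == 0 and b == 1)  # 0 for i, 1 for j
    t = sum(1 for a, b in pairs if a == 0 and b == 0)  # 0 for both objects
    return (r + s) / (q + r + s + t)


# Assumed values: project title is 1 for objects 2 and 4, and 0 for object 1.
d_2_4 = symmetric_binary_dissimilarity([1], [1])
d_1_2 = symmetric_binary_dissimilarity([0], [1])
print(d_2_4, d_1_2)
```

With a single binary attribute the result is always 0 (values match) or 1 (values differ), which is why the matrix contains only 0s and 1s.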
Calculating the remaining entries in the same way, we get the dissimilarity matrix:
j \ i 1 2 3 4 5 6 7 8 9 10
1 0
2 1 0
3 0 1 0
4 1 0 1 0
5 1 0 1 0 0
6 1 0 1 0 0 0
7 1 0 1 0 0 0 0
8 1 0 1 0 0 0 0 0
9 1 0 1 0 0 0 0 0 0
10 1 0 1 0 0 0 0 0 0 0
NOTE : Since the distance matrix is symmetric, only the lower half of the matrix is filled in.
In calculating the dissimilarity for the ordinal attribute, we first map its values to ranks and normalize the ranks to [0, 1].
Then we find the dissimilarity between tuples i and j as the Euclidean distance between their normalized ranks.
For example :
For i=2, we have proj desc = 0.75 and for j=4, we have proj desc = 0.25, so d(2,4) = |0.75 - 0.25| = 0.50.
j \ i 1 2 3 4 5 6 7 8 9 10
1 0
2 0.75 0
3 1.00 0.25 0
4 0.25 0.50 0.75 0
5 0.75 0.00 0.25 0.50 0
6 0.75 0.00 0.25 0.50 0.00 0
7 0.75 0.00 0.25 0.50 0.00 0.00 0
8 0.25 0.50 0.75 0.00 0.50 0.50 0.50 0
9 0.50 0.25 0.50 0.25 0.25 0.25 0.25 0.25 0
10 0.50 0.25 0.50 0.25 0.25 0.25 0.25 0.25 0.00 0
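The rank normalization and distance for the ordinal attribute can be sketched as follows. The formula z = (rank - 1) / (M - 1) is the usual way to map M ordered levels onto [0, 1]. The choice of M = 5 levels and of ranks 4 and 2 for objects 2 and 4 is an assumption made only to reproduce the normalized values 0.75 and 0.25 quoted in the example, and the function names are hypothetical.

```python
def normalize_rank(rank, num_levels):
    """Map a 1-based rank onto [0, 1] using (rank - 1) / (num_levels - 1)."""
    if num_levels < 2:
        raise ValueError("need at least two ordinal levels")
    if not 1 <= rank <= num_levels:
        raise ValueError("rank must lie between 1 and num_levels")
    return (rank - 1) / (num_levels - 1)


def ordinal_dissimilarity(rank_i, rank_j, num_levels):
    """Distance between two objects on one ordinal attribute (normalized ranks)."""
    return abs(normalize_rank(rank_i, num_levels) - normalize_rank(rank_j, num_levels))


# Assumed: 5 ordinal levels, object 2 has rank 4 and object 4 has rank 2.
z_2 = normalize_rank(4, 5)
z_4 = normalize_rank(2, 5)
d_2_4 = ordinal_dissimilarity(4, 2, 5)
print(z_2, z_4, d_2_4)
```

Under these assumptions the sketch gives 0.75, 0.25 and 0.5, matching the entry in row 4, column 2 of the matrix.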
CALCULATING THE DISTANCE FOR THE NUMERIC ATTRIBUTE :
The distance between numeric attributes can be calculated as the Euclidean distance or the Manhattan
distance. In this case, since we have only one numeric attribute, both give the same value.
j \ i 1 2 3 4 5 6 7 8 9 10
1 0
2 0.8896 0
3 0.3961 0.4935 0
4 0.8896 0.0000 0.4935 0
5 0.7857 0.1039 0.3896 0.1039 0
6 0.8182 0.0714 0.4221 0.0714 0.0325 0
7 0.8701 0.0195 0.4740 0.0195 0.0844 0.0519 0
8 0.7857 0.1039 0.3896 0.1039 0.0000 0.0324 0.0844 0
9 1.0000 0.1104 0.6039 0.1104 0.2143 0.1818 0.1299 0.2143 0
10 0.8376 0.0519 0.441 0.0519 0.0519 0.0195 0.0325 0.0519 0.1623 0
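The claim that the Euclidean and Manhattan distances coincide for a single attribute can be checked with a short sketch. The raw values below are hypothetical, and min-max scaling to [0, 1] is an assumption suggested by the matrix (its largest entry is 1.0000); the source does not state which normalization was used.

```python
import math


def min_max_normalize(values):
    """Scale values to [0, 1]; assumed normalization, not confirmed by the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


raw = [10.0, 40.0, 25.0]      # hypothetical raw values of one numeric attribute
z = min_max_normalize(raw)    # one normalized value per object

# With one attribute each object is a 1-element vector, so both distances reduce to |x - y|.
d_manhattan = manhattan([z[0]], [z[2]])
d_euclidean = euclidean([z[0]], [z[2]])
print(z, d_manhattan, d_euclidean)
```

Because every object lies on a single axis, the distances are also additive along that axis, which is a useful consistency check on the matrix entries.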
CALCULATING THE DISTANCE FOR THE CONTINUOUS ATTRIBUTE :
The distance between continuous attributes can also be calculated as the Euclidean distance or the
Manhattan distance. In this case, since we have only one such attribute, both give the same
value.
j \ i 1 2 3 4 5 6 7 8 9 10
1 0
2 0.595833 0
3 0.154167 0.75 0
4 0.604167 0.00833344 0.758333 0
5 0.25 1 0.241667 0
CALCULATION OF THE OVERALL DISTANCE BETWEEN THE OBJECTS, CONSIDERING ALL THE MIXED-TYPE
ATTRIBUTES SHOWN ABOVE :
For example : Let us take i=3 and j=9. Then the overall distance between objects i and j can be
calculated as :
j \ i 1 2 3 4 5 6 7 8 9 10
1 0
2 0.808861 0
3 0.387568 0.623377 0
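The three entries shown are consistent with taking the unweighted mean of the four per-attribute dissimilarities (binary, ordinal, numeric, continuous). The sketch below assumes that combination rule, written in the common weighted form where an indicator of 0 drops an attribute (for example, when a value is missing). The function name and the indicator argument are illustrative assumptions, not taken from the source; the input values are read from the four matrices above.

```python
def mixed_dissimilarity(per_attribute, indicators=None):
    """Weighted mean of per-attribute dissimilarities; indicator 0 skips an attribute."""
    if indicators is None:
        indicators = [1] * len(per_attribute)  # assume every attribute is usable
    if len(indicators) != len(per_attribute):
        raise ValueError("one indicator per attribute is required")
    total = sum(indicators)
    if total == 0:
        raise ValueError("at least one attribute must be usable")
    return sum(w * d for w, d in zip(indicators, per_attribute)) / total


# Order of values: binary, ordinal, numeric, continuous.
d_1_2 = mixed_dissimilarity([1, 0.75, 0.8896, 0.595833])
d_1_3 = mixed_dissimilarity([0, 1.00, 0.3961, 0.154167])
d_2_3 = mixed_dissimilarity([1, 0.25, 0.4935, 0.75])
print(round(d_1_2, 4), round(d_1_3, 4), round(d_2_3, 4))
```

The results agree with the tabulated 0.808861, 0.387568 and 0.623377 to about five decimal places; the small differences come from the rounding of the per-attribute values.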