Answers-Q3 1-Q3 2 PDF
Data samples: (0,0), (0,1), (1,0), (3,3), (5,6), (8,9), (9,8), (9,9)
Iteration 1
For Cluster-1: centroid of samples (0,0), (0,1), and (1,0) = (0.33,0.33)
For Cluster-2: centroid of samples (3,3), (5,6), (8,9), (9,8), and (9,9) = (6.8,7)
Iteration 2
For Cluster-1: centroid of samples (0,0), (0,1), (1,0), and (3,3) = (1,1)
For Cluster-2: centroid of samples (5,6), (8,9), (9,8), and (9,9) = (7.75,8)
Iteration 3
For Cluster-1: centroid of samples (0,0), (0,1), (1,0), and (3,3) = (1,1)
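For Cluster-2: centroid of samples (5,6), (8,9), (9,8), and (9,9) = (7.75,8)
The centroids and cluster memberships are the same as in Iteration 2, so the algorithm has converged.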
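The iterations above can be reproduced with a minimal Python sketch. It assumes Euclidean distance and starts from the Iteration 1 membership, in which Cluster-2 holds (3,3), (5,6), (8,9), (9,8), (9,9) and Cluster-1 holds the remaining three samples. The initial centroids are not stated in the answer, so this starting membership is an assumption, and the names `centroid`, `assign`, and `history` are illustrative only.

```python
from math import dist

samples = [(0, 0), (0, 1), (1, 0), (3, 3), (5, 6), (8, 9), (9, 8), (9, 9)]

def centroid(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]

# Assumed starting point: Iteration 1 membership (0 = Cluster-1, 1 = Cluster-2).
labels = [0, 0, 0, 1, 1, 1, 1, 1]
history = []  # centroids computed in each iteration

while True:
    centroids = [centroid([p for p, l in zip(samples, labels) if l == k])
                 for k in range(2)]
    history.append(centroids)
    new_labels = assign(samples, centroids)
    if new_labels == labels:  # memberships stable: converged
        break
    labels = new_labels

print(history[-1])  # final centroids: (1.0, 1.0) and (7.75, 8.0)
```

The loop stops as soon as reassignment leaves the memberships unchanged, which plays the role of the confirming Iteration 3 in the worked answer.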
\[
\begin{aligned}
\mathrm{WCSS} ={} & \{(0-1)^2 + (0-1)^2\} + \{(0-1)^2 + (1-1)^2\} + \{(1-1)^2 + (0-1)^2\} \\
& + \{(3-1)^2 + (3-1)^2\} + \{(5-7.75)^2 + (6-8)^2\} + \{(8-7.75)^2 + (9-8)^2\} \\
& + \{(9-7.75)^2 + (8-8)^2\} + \{(9-7.75)^2 + (9-8)^2\}
\end{aligned}
\]
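As a check on the WCSS expression, the sum can be evaluated with a short Python sketch. The dictionary layout and the function name `wcss` are illustrative choices, not part of the original answer.

```python
# Final centroids mapped to the samples assigned to them.
clusters = {
    (1.0, 1.0): [(0, 0), (0, 1), (1, 0), (3, 3)],
    (7.75, 8.0): [(5, 6), (8, 9), (9, 8), (9, 9)],
}

def wcss(clusters):
    """Sum of squared Euclidean distances from each sample to its centroid."""
    return sum((x - cx) ** 2 + (y - cy) ** 2
               for (cx, cy), points in clusters.items()
               for x, y in points)

per_cluster = {c: wcss({c: points}) for c, points in clusters.items()}
total = wcss(clusters)

print(per_cluster)  # {(1.0, 1.0): 12.0, (7.75, 8.0): 16.75}
print(total)        # 28.75
```

Cluster-1 contributes 12 and Cluster-2 contributes 16.75, giving a total WCSS of 28.75.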
Subjecting the data points in Question 3.1 to K-medoids
Assumptions: K = 2, the initial medoids are (0,0) and (5,6), and Manhattan distance is
used as the dissimilarity measure.
What will be the cost if in Cluster-1, the medoid (0,0) is swapped with the non-
medoid data point (1,0)?
Cost = 1+2+0+5+0+6+6+7 = 27. With the initial medoids the cost is
0+1+1+5+0+6+6+7 = 26, so the cost increases and the swap should be avoided.
Will there be a decrease in the cost if in Cluster-2, the medoid (5,6) is swapped
with the non-medoid data point (8,9)?
Cost = 0+1+1+6+6+0+2+1 = 17. The cost decreases and therefore, they can be
swapped. The medoid for Cluster-2 should be (8,9).
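The swap costs above can be checked with a small Python sketch. It assumes the total cost is the sum, over all samples, of the Manhattan distance to the nearest medoid. The helper names `manhattan` and `cost` are illustrative.

```python
from itertools import combinations

samples = [(0, 0), (0, 1), (1, 0), (3, 3), (5, 6), (8, 9), (9, 8), (9, 9)]

def manhattan(p, q):
    """Manhattan (L1) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def cost(medoids):
    """Total dissimilarity: each sample contributes its distance to the nearest medoid."""
    return sum(min(manhattan(p, m) for m in medoids) for p in samples)

initial = cost([(0, 0), (5, 6)])  # initial medoids
swap_1 = cost([(1, 0), (5, 6)])   # (0,0) replaced by (1,0)
swap_2 = cost([(0, 0), (8, 9)])   # (5,6) replaced by (8,9)

# Exhaustive check over every possible pair of medoids.
best = min(cost(pair) for pair in combinations(samples, 2))

print(initial, swap_1, swap_2, best)  # 26 27 17 17
```

The exhaustive minimum over all 28 medoid pairs is also 17, so several pairs tie at that cost but none improves on it.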
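Since no other swap could reduce the cost below 17, the final clusters can be
formed based on the last swap. The final clusters will be
Cluster-1: (0,0), (0,1), (1,0), and (3,3), with medoid (0,0)
Cluster-2: (5,6), (8,9), (9,8), and (9,9), with medoid (8,9)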
Answer for part-(a): single-linkage clustering
ITERATION 1
STEP 1.1
P1 P2 P3 P4 P5 P6
P1 1
P2 0.7895 1
P3 0.1579 0.3684 1
P4 0.0100 0.2105 0.8421 1
P5 0.5292 0.7023 0.5292 0.3840 1
P6 0.3542 0.5480 0.6870 0.5573 0.8105 1
The highest similarity is 0.8421, between P3 and P4, so they are merged into P34.
STEP 1.2
P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P5 0.5292 0.7023 0.5292 1
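P6 0.3542 0.5480 0.6870 0.8105 1
Each P34 entry is the larger of the corresponding P3 and P4 similarities (single linkage on a similarity matrix).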
ITERATION 2
STEP 2.1
P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P5 0.5292 0.7023 0.5292 1
P6 0.3542 0.5480 0.6870 0.8105 1
The highest similarity is 0.8105, between P5 and P6, so they are merged into P56.
STEP 2.2
P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P56 0.5292 0.7023 0.6870 1
ITERATION 3
STEP 3.1
P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.1579 0.3684 1
P56 0.5292 0.7023 0.6870 1
The highest similarity is 0.7895, between P1 and P2, so they are merged into P12.
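STEP 3.2
P12 P34 P56
P12 1
P34 0.3684 1
P56 0.7023 0.6870 1
ITERATION 4
STEP 4.1
The highest remaining similarity is 0.7023, between P12 and P56, so they are merged into P1256.
The next level (i.e. the topmost level) of the hierarchy will have P1256 and P34.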
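Resulting dendrogram (figure not reproduced): P3 and P4 join at 0.8421, P5 and P6 at 0.8105, P1 and P2 at 0.7895, P12 and P56 at 0.7023, and finally P1256 and P34 at 0.6870.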
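The merge sequence for this part can be reproduced with a minimal Python sketch that mirrors the matrix-update steps: find the most similar pair, merge it, and fill in the new row. It assumes the matrix holds similarities, so the single-linkage update takes the larger of the two merged rows. The function `agglomerate` and its `combine` parameter are illustrative names, and the label-building line assumes single-digit point numbers.

```python
names = ["P1", "P2", "P3", "P4", "P5", "P6"]
# Lower triangle of the similarity matrix from STEP 1.1 (diagonal omitted).
lower = [
    [],
    [0.7895],
    [0.1579, 0.3684],
    [0.0100, 0.2105, 0.8421],
    [0.5292, 0.7023, 0.5292, 0.3840],
    [0.3542, 0.5480, 0.6870, 0.5573, 0.8105],
]
# Symmetric lookup keyed by unordered pairs of cluster labels.
sim = {frozenset((names[i], names[j])): s
       for i, row in enumerate(lower) for j, s in enumerate(row)}

def agglomerate(sim, clusters, combine):
    """Repeatedly merge the most similar pair and update the similarity table."""
    sim, clusters, merges = dict(sim), list(clusters), []
    while len(clusters) > 1:
        a, b = max(((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
                   key=lambda pair: sim[frozenset(pair)])
        merges.append((a, b, sim[frozenset((a, b))]))
        # Label such as "P34" built from the member digits (single-digit labels assumed).
        merged = "P" + "".join(sorted((a + b).replace("P", "")))
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            sim[frozenset((merged, c))] = combine(sim[frozenset((a, c))],
                                                  sim[frozenset((b, c))])
        clusters.append(merged)
    return merges

merges = agglomerate(sim, names, combine=max)  # max = single linkage on similarities
for a, b, s in merges:
    print(a, b, s)
```

The last two merges join P12 with P56 at 0.7023 and then P1256 with P34 at 0.6870, matching the top of the hierarchy described above.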
Answer for part-(b): complete-linkage clustering
ITERATION 1
STEP 1.1
P1 P2 P3 P4 P5 P6
P1 1
P2 0.7895 1
P3 0.1579 0.3684 1
P4 0.0100 0.2105 0.8421 1
P5 0.5292 0.7023 0.5292 0.3840 1
P6 0.3542 0.5480 0.6870 0.5573 0.8105 1
The highest similarity is 0.8421, between P3 and P4, so they are merged into P34.
STEP 1.2
P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P5 0.5292 0.7023 0.3840 1
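P6 0.3542 0.5480 0.5573 0.8105 1
Each P34 entry is the smaller of the corresponding P3 and P4 similarities (complete linkage on a similarity matrix).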
ITERATION 2
STEP 2.1
P1 P2 P34 P5 P6
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P5 0.5292 0.7023 0.3840 1
P6 0.3542 0.5480 0.5573 0.8105 1
The highest similarity is 0.8105, between P5 and P6, so they are merged into P56.
STEP 2.2
P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P56 0.3542 0.5480 0.3840 1
ITERATION 3
STEP 3.1
P1 P2 P34 P56
P1 1
P2 0.7895 1
P34 0.0100 0.2105 1
P56 0.3542 0.5480 0.3840 1
The highest similarity is 0.7895, between P1 and P2, so they are merged into P12.
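STEP 3.2
P12 P34 P56
P12 1
P34 0.0100 1
P56 0.3542 0.3840 1
ITERATION 4
STEP 4.1
The highest remaining similarity is 0.3840, between P34 and P56, so they are merged into P3456.
The next level (i.e. the topmost level) of the hierarchy will have P3456 and P12.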
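Resulting dendrogram (figure not reproduced): P3 and P4 join at 0.8421, P5 and P6 at 0.8105, P1 and P2 at 0.7895, P34 and P56 at 0.3840, and finally P12 and P3456 at 0.0100.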
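The complete-linkage result can be cross-checked with a Python sketch that skips the matrix updates and applies the definition directly: the similarity between two clusters is taken as the smallest similarity over all cross-cluster pairs of original points. This assumes the table holds similarities rather than distances. The names `pair_sim`, `s`, and `complete_sim` are illustrative.

```python
from itertools import combinations

# Pairwise similarities from STEP 1.1.
pair_sim = {
    ("P1", "P2"): 0.7895,
    ("P1", "P3"): 0.1579, ("P2", "P3"): 0.3684,
    ("P1", "P4"): 0.0100, ("P2", "P4"): 0.2105, ("P3", "P4"): 0.8421,
    ("P1", "P5"): 0.5292, ("P2", "P5"): 0.7023, ("P3", "P5"): 0.5292,
    ("P4", "P5"): 0.3840,
    ("P1", "P6"): 0.3542, ("P2", "P6"): 0.5480, ("P3", "P6"): 0.6870,
    ("P4", "P6"): 0.5573, ("P5", "P6"): 0.8105,
}

def s(p, q):
    """Symmetric lookup of the similarity between two original points."""
    return pair_sim[(p, q)] if (p, q) in pair_sim else pair_sim[(q, p)]

def complete_sim(A, B):
    """Complete linkage on similarities: the least similar cross-cluster pair."""
    return min(s(p, q) for p in A for q in B)

clusters = [frozenset([p]) for p in ("P1", "P2", "P3", "P4", "P5", "P6")]
merges = []
while len(clusters) > 1:
    # Merge the pair of clusters with the highest complete-linkage similarity.
    A, B = max(combinations(clusters, 2), key=lambda ab: complete_sim(*ab))
    merges.append((sorted(A), sorted(B), complete_sim(A, B)))
    clusters = [c for c in clusters if c not in (A, B)] + [A | B]

for a, b, value in merges:
    print(a, b, value)
```

Here the fourth merge joins P34 with P56 at 0.3840, and the final merge joins P12 with P3456 at 0.0100, which is where the result departs from the single-linkage hierarchy.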