
K-Medoids Clustering Algorithm

Now that we have an overview of K-Medoids clustering, let us discuss the
algorithm itself.

1. First, select K random data points from the dataset and use them as
medoids.
2. Calculate the distance of each data point from the medoids. You can use
Euclidean, Manhattan, or squared Euclidean distance as the distance
measure.
3. Once we know the distance of each data point from the medoids, assign
each data point to the cluster of its closest medoid.
4. After determining the clusters, calculate the sum of the distances of all
the non-medoid data points to the medoid of their cluster. Let this cost
be Ci.
5. Now, select a random non-medoid data point Dj from the dataset and swap it
with a medoid Mi. Here, Dj becomes a temporary medoid. After swapping,
calculate the distance of all the non-medoid data points to the current
medoid of each cluster. Let this cost be Cj.
6. If Ci > Cj, the swap reduced the cost, so the current medoids with Dj as
one of the medoids are made permanent. Otherwise, we undo the swap, and Mi
is reinstated as the medoid.
7. Repeat steps 4 to 6 until no change occurs in the clusters.
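These steps translate almost directly into code. The following is a minimal Python sketch of the swap procedure above; it is an illustration rather than an optimized implementation. The helper names manhattan, total_cost, and k_medoids are our own choices, and Manhattan distance is assumed to match the numerical example that follows.

```python
# A minimal sketch of the swap-based K-Medoids procedure described above.
import random

def manhattan(p, q):
    # Manhattan distance between two 2-D points (step 2).
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def total_cost(points, medoids):
    # Sum of each point's distance to its nearest medoid (step 4).
    return sum(min(manhattan(p, m) for m in medoids) for p in points)

def k_medoids(points, k, seed=0):
    random.seed(seed)
    medoids = random.sample(points, k)        # step 1: random initial medoids
    cost = total_cost(points, medoids)        # initial cost Ci
    improved = True
    while improved:                           # step 7: repeat until stable
        improved = False
        for i in range(k):
            for p in points:
                if p in medoids:
                    continue
                candidate = medoids[:i] + [p] + medoids[i + 1:]  # step 5: swap
                new_cost = total_cost(points, candidate)         # cost Cj
                if new_cost < cost:           # step 6: keep the cheaper swap
                    medoids, cost = candidate, new_cost
                    improved = True
    # steps 2-3: assign each point to the cluster of its closest medoid
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: manhattan(p, m))].append(p)
    return medoids, clusters, cost
```

For example, calling k_medoids with the ten points of the dataset below and k=2 returns the medoids, the clusters, and the final cost.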

K-Medoids Clustering Numerical Example With Solution
Now that we have discussed the algorithm, let us discuss a numerical example of
k-medoids clustering.

The dataset for clustering is as follows.

Point   Coordinates
A1      (2, 6)
A2      (3, 8)
A3      (4, 7)
A4      (6, 2)
A5      (6, 4)
A6      (7, 3)
A7      (7, 4)
A8      (8, 5)
A9      (7, 6)
A10     (3, 4)

Dataset For K-Medoids Clustering

Iteration 1
Suppose that we want to group the above dataset into two clusters. So, we will
randomly choose two medoids.

Here, the choice of medoids is important for efficient execution. Hence, we have
selected two points from the dataset that can be potential medoids for the final
clusters. The following are the two points from the dataset that we have selected
as medoids.

- M1 = (3, 4)
- M2 = (7, 3)

Now, we will calculate the distance between each data point and the medoids
using the Manhattan distance measure. For example, the distance between
A1 (2, 6) and M1 (3, 4) is |2 - 3| + |6 - 4| = 3. The results have been
tabulated as follows.

Point   Coordinates   Distance from M1 (3, 4)   Distance from M2 (7, 3)   Assigned cluster
A1      (2, 6)        3                         8                         Cluster 1
A2      (3, 8)        4                         9                         Cluster 1
A3      (4, 7)        4                         7                         Cluster 1
A4      (6, 2)        5                         2                         Cluster 2
A5      (6, 4)        3                         2                         Cluster 2
A6      (7, 3)        5                         0                         Cluster 2
A7      (7, 4)        4                         1                         Cluster 2
A8      (8, 5)        6                         3                         Cluster 2
A9      (7, 6)        6                         3                         Cluster 2
A10     (3, 4)        0                         5                         Cluster 1

Iteration 1
The clusters made with medoids (3, 4) and (7, 3) are as follows.

- Points in cluster 1 = {(2, 6), (3, 8), (4, 7), (3, 4)}
- Points in cluster 2 = {(7, 4), (6, 2), (6, 4), (7, 3), (8, 5), (7, 6)}

After assigning clusters, we will calculate the cost for each cluster and find
their sum. The cost is simply the sum of the distances of all the data points
from the medoid of the cluster they belong to.

Hence, the cost for the current clustering is 3+4+4+2+2+0+1+3+3+0 = 22.
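As a quick sanity check, a few standalone lines of Python (assuming the same dataset and Manhattan distance) reproduce this cost of 22:

```python
# Verifying the iteration-1 cost with medoids (3, 4) and (7, 3).
points = [(2, 6), (3, 8), (4, 7), (6, 2), (6, 4),
          (7, 3), (7, 4), (8, 5), (7, 6), (3, 4)]
medoids = [(3, 4), (7, 3)]

cost = sum(min(abs(x - mx) + abs(y - my) for mx, my in medoids)
           for x, y in points)
print(cost)  # 22
```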

Iteration 2
Now, we will select another non-medoid point, A7 (7, 4), and make it a temporary
medoid for the second cluster. Hence,

- M1 = (3, 4)
- M2 = (7, 4)

Now, let us calculate the distance between all the data points and the current
medoids.

Point   Coordinates   Distance from M1 (3, 4)   Distance from M2 (7, 4)   Assigned cluster
A1      (2, 6)        3                         7                         Cluster 1
A2      (3, 8)        4                         8                         Cluster 1
A3      (4, 7)        4                         6                         Cluster 1
A4      (6, 2)        5                         3                         Cluster 2
A5      (6, 4)        3                         1                         Cluster 2
A6      (7, 3)        5                         1                         Cluster 2
A7      (7, 4)        4                         0                         Cluster 2
A8      (8, 5)        6                         2                         Cluster 2
A9      (7, 6)        6                         2                         Cluster 2
A10     (3, 4)        0                         4                         Cluster 1

Iteration 2
The data points haven't changed clusters after changing the medoids. Hence, the
clusters are:

- Points in cluster 1 = {(2, 6), (3, 8), (4, 7), (3, 4)}
- Points in cluster 2 = {(7, 4), (6, 2), (6, 4), (7, 3), (8, 5), (7, 6)}

Now, let us again calculate the cost for each cluster and find their sum. The
total cost this time is 3+4+4+3+1+1+0+2+2+0 = 20.

Here, the current cost is less than the cost calculated in the previous
iteration. Hence, we will make the swap permanent and make (7, 4) the medoid for
cluster 2. If the cost this time had been greater than the previous cost,
i.e., 22, we would have reverted the change. The new medoids after this
iteration are (3, 4) and (7, 4), with no change in the clusters.

Iteration 3
Now, let us again change the medoid of cluster 2, this time to (6, 4). Hence,
the new medoids for the clusters are M1 = (3, 4) and M2 = (6, 4).

Let us calculate the distance between the data points and the above medoids to
find the new clusters. The results have been tabulated as follows.

Point   Coordinates   Distance from M1 (3, 4)   Distance from M2 (6, 4)   Assigned cluster
A1      (2, 6)        3                         6                         Cluster 1
A2      (3, 8)        4                         7                         Cluster 1
A3      (4, 7)        4                         5                         Cluster 1
A4      (6, 2)        5                         2                         Cluster 2
A5      (6, 4)        3                         0                         Cluster 2
A6      (7, 3)        5                         2                         Cluster 2
A7      (7, 4)        4                         1                         Cluster 2
A8      (8, 5)        6                         3                         Cluster 2
A9      (7, 6)        6                         3                         Cluster 2
A10     (3, 4)        0                         3                         Cluster 1

Iteration 3
Again, the clusters haven't changed. Hence, the clusters are:

- Points in cluster 1 = {(2, 6), (3, 8), (4, 7), (3, 4)}
- Points in cluster 2 = {(7, 4), (6, 2), (6, 4), (7, 3), (8, 5), (7, 6)}

Now, let us again calculate the cost for each cluster and find their sum. The
total cost this time is 3+4+4+2+0+2+1+3+3+0 = 22.

The current cost is 22, which is greater than the cost in the previous
iteration, i.e., 20. Hence, we will revert the change, and the point (7, 4)
will again be made the medoid for cluster 2.

So, the clusters after this iteration are cluster 1 = {(2, 6), (3, 8), (4, 7), (3, 4)}
and cluster 2 = {(7, 4), (6, 2), (6, 4), (7, 3), (8, 5), (7, 6)}. The medoids
are (3, 4) and (7, 4).

We keep swapping the medoids with non-medoid data points in this manner. The set
of medoids for which the cost is the least, along with the associated clusters,
is made permanent. So, after all the iterations, you get the final clusters and
their medoids.

The K-Medoids clustering algorithm is a computation-intensive algorithm that
requires many iterations. In each iteration, we need to calculate the distance
between the medoids and the data points, assign clusters, and compute the cost.
Hence, K-Medoids clustering is not well suited for large datasets.

Different Implementations of K-Medoids Clustering

The K-Medoids clustering algorithm has different implementations such as PAM,
CLARA, and CLARANS. Let us discuss their properties one by one.

PAM Implementation
The PAM implementation of the K-Medoids clustering algorithm is the most basic
implementation. In the numerical example on k-medoids clustering above, we used
the PAM implementation.

The PAM implementation performs iterative clustering on the entire dataset and
is computationally intensive. Hence, it is not very efficient for large datasets.
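If you prefer not to implement PAM by hand, the third-party scikit-learn-extra package exposes a KMedoids estimator with a PAM option. Note the assumption here: scikit-learn-extra is a separate install (e.g. pip install scikit-learn-extra) and is not part of scikit-learn itself.

```python
# Hedged example: assumes the scikit-learn-extra package is installed.
import numpy as np
from sklearn_extra.cluster import KMedoids

X = np.array([(2, 6), (3, 8), (4, 7), (6, 2), (6, 4),
              (7, 3), (7, 4), (8, 5), (7, 6), (3, 4)])

# method="pam" selects the classic PAM swap procedure, and
# metric="manhattan" matches the distance used in the example above.
km = KMedoids(n_clusters=2, metric="manhattan", method="pam",
              random_state=0).fit(X)
print(km.cluster_centers_)  # the two medoids
print(km.labels_)           # cluster index of each point
```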

CLARA Implementation
The CLARA implementation of k-medoids clustering is an improvement over the PAM
implementation.

- In the CLARA implementation, the algorithm first selects samples of data
points from the dataset. Then, it performs PAM clustering on each sample.
After applying PAM to the samples, CLARA outputs the best clusters from the
sampled runs.
- CLARA performs clustering on samples of a given dataset. Hence, we can use
it to perform clustering on large datasets too. With increasing dataset
size, the effectiveness of the CLARA implementation of k-medoids clustering
also increases. For small datasets, CLARA doesn't work efficiently; hence,
you should use it on large datasets for better performance.
- In the case of unbalanced datasets, if the samples are also biased, the
results of the clustering algorithm won't be good. Hence, you need to take
measures during data preprocessing to make the data as balanced as
possible.
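A minimal sketch of this sample-then-evaluate idea follows, reusing the hypothetical k_medoids and total_cost helpers from the PAM sketch earlier. The sample sizes are illustrative assumptions; the classic CLARA formulation draws a handful of samples of size 40 + 2k.

```python
import random

def clara(points, k, n_samples=5, sample_size=None, seed=0):
    # CLARA sketch: run PAM (k_medoids from the earlier sketch) on several
    # random samples and keep the medoids that score best on the FULL dataset.
    random.seed(seed)
    size = sample_size or min(40 + 2 * k, len(points))  # classic heuristic
    best_medoids, best_cost = None, float("inf")
    for i in range(n_samples):
        sample = random.sample(points, size)
        medoids, _, _ = k_medoids(sample, k, seed=seed + i)  # PAM on sample
        cost = total_cost(points, medoids)   # evaluate on all points
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```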
CLARANS Implementation
The CLARANS implementation of the k-medoids clustering algorithm is the most
efficient implementation.

- While clustering, CLARANS examines a random sample of neighboring solutions
at each step. Due to this, it doesn't restrict the search to a particular
area and gives the best result based on all the samples.
- CLARANS examines neighboring solutions at each step. Hence, for a large
dataset with a large number of clusters, the execution time of the CLARANS
implementation of k-medoids clustering becomes high.
- We can define the number of local optima to search for while clustering
using CLARANS. In PAM or CLARA, the algorithm runs over all the data points
to find a local optimum and then randomly picks another starting point once
it finds one. In CLARANS, we can limit the number of times this process is
repeated.
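The following is a rough sketch of the CLARANS search just described, again reusing total_cost from the PAM sketch. The parameters numlocal (number of local optima to collect) and maxneighbor (random swaps to try before declaring a local optimum) follow the naming in the original CLARANS paper; the default values here are illustrative assumptions.

```python
import random

def clarans(points, k, numlocal=2, maxneighbor=20, seed=0):
    # CLARANS sketch: from each of `numlocal` random starting medoid sets,
    # try up to `maxneighbor` random single-medoid swaps, moving whenever
    # the cost improves; keep the best local optimum found.
    random.seed(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(numlocal):
        medoids = random.sample(points, k)    # random starting node
        cost = total_cost(points, medoids)    # total_cost from the PAM sketch
        failures = 0
        while failures < maxneighbor:
            i = random.randrange(k)           # medoid to swap out
            p = random.choice([q for q in points if q not in medoids])
            candidate = medoids[:i] + [p] + medoids[i + 1:]
            new_cost = total_cost(points, candidate)
            if new_cost < cost:               # move to the better neighbor
                medoids, cost = candidate, new_cost
                failures = 0
            else:
                failures += 1
        if cost < best_cost:                  # keep the best local optimum
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```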

Applications of K-Medoids Clustering in Machine Learning
K-Medoids clustering is an alternative to the K-means clustering algorithm.
Hence, you can use k-medoids clustering wherever k-means is applied.

- You can use k-medoids clustering for document classification. The documents
are first converted to vectors called word embeddings. Then, you can use
k-medoids clustering to group the documents into clusters using their word
embeddings.
- E-commerce websites, supermarkets, and malls use transaction data for
customer segmentation based on buying behavior. It helps them increase
sales by recommending products according to user behavior using
recommendation systems.
- K-Medoids clustering can also be used in cyber profiling. It involves
grouping people based on their connections to each other. By collecting
usage data from individual users, you can cluster them into groups for
profiling.
- By grouping similar pixels into clusters, you can also perform image
segmentation using k-medoids clustering.
- You can also perform fraud detection on banking and insurance data using
k-medoids clustering. By using historical data on fraud, you can find
probable fraudulent transactions in a given dataset.
- You can also use k-medoids clustering in various other tasks such as social
media profiling, ride-share data analysis, identification of localities
with high crime rates, etc.

Advantages of K-Medoids Clustering


- K-Medoids clustering is a simple iterative algorithm and is very easy to
implement.
- K-Medoids clustering is guaranteed to converge. Hence, we are guaranteed to
get results when we perform k-medoids clustering on any dataset.
- K-Medoids clustering doesn't apply only to a specific domain. Owing to this
generality, we can use k-medoids clustering in different machine learning
applications ranging from text data to geospatial data and from financial
data to e-commerce data.
- The medoids in k-medoids clustering are selected randomly. We can choose
the initial medoids in a way that minimizes the number of iterations. To
improve the performance of the algorithm, you can warm-start the choice of
medoids by selecting specific data points as medoids after data analysis.
- Compared to other partitioning clustering algorithms such as k-median and
k-modes clustering, the k-medoids clustering algorithm is faster in
execution.
- The K-Medoids clustering algorithm is very robust and deals effectively
with outliers and noise in the dataset. Compared to k-means clustering,
k-medoids clustering is a better choice for analyzing data with significant
noise and outliers.

Disadvantages of K-Medoids Clustering


- K-Medoids clustering is not a scalable algorithm. For large datasets, the
execution time of the algorithm becomes very high. Due to this, you cannot
use the K-Medoids algorithm for very large datasets.
- In K-Medoids clustering, we don't know the optimal number of clusters in
advance. Hence, we need to perform clustering using different values of k
to find the optimal number of clusters (see the silhouette sketch after
this list).
- When the dimensionality of the dataset increases, the distances between
the data points start to become similar. Due to this, the distance between
a data point and the various medoids becomes almost the same. This
introduces inefficiency in the execution. To overcome this problem, you can
use advanced clustering algorithms like spectral clustering. Alternatively,
you can also try to reduce the dimensionality of the dataset during data
preprocessing.
- The K-Medoids clustering algorithm chooses the initial medoids randomly.
Also, the output clusters depend primarily on the initial medoids. Due to
this, every time you run the k-medoids clustering algorithm, you may get
different clusters.
- The K-Medoids clustering algorithm uses the distance from a central point
(the medoid). Due to this, the k-medoids algorithm produces
circular/spherical clusters. It is not useful for clustering data into
arbitrarily shaped clusters.
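As referenced in the list above, one common way to pick k is to score several candidate values and keep the best. Below is a hedged sketch using silhouette scores; it assumes the scikit-learn-extra KMedoids estimator shown earlier plus scikit-learn's silhouette_score.

```python
# Hedged sketch: choosing k by silhouette score. Assumes scikit-learn and
# the separately installed scikit-learn-extra package.
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn_extra.cluster import KMedoids

X = np.array([(2, 6), (3, 8), (4, 7), (6, 2), (6, 4),
              (7, 3), (7, 4), (8, 5), (7, 6), (3, 4)])

for k in range(2, 6):
    labels = KMedoids(n_clusters=k, metric="manhattan",
                      method="pam", random_state=0).fit_predict(X)
    print(k, silhouette_score(X, labels, metric="manhattan"))
# Prefer the k with the highest silhouette score.
```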

Conclusion
In this article, we have discussed k-medoids clustering with a numerical example.
We have also briefly discussed the different implementations of k-medoids
clustering, its applications, advantages, and disadvantages.

To learn more about machine learning, you can read this article on regression in
machine learning. You might also like this article on polynomial regression
using sklearn in Python.

The K-Medoids algorithm (also called Partitioning Around Medoids, PAM) was
proposed in 1987 by Kaufman and Rousseeuw. A medoid can be defined as a point in
the cluster whose dissimilarity to all the other points in the cluster is
minimal. The dissimilarity of the medoid (Ci) and an object (Pi) is calculated
using E = |Pi - Ci|. The total cost in the K-Medoids algorithm is then the sum
of the dissimilarities of all objects to their nearest medoid:
Cost = Σ (over all clusters Ci) Σ (over all objects Pi in Ci) |Pi - Ci|.
Algorithm:
1. Initialize: select k random points out of the n data points as the medoids.
2. Associate each data point to the closest medoid using any common distance
metric.
3. While the cost decreases: for each medoid m and for each data point o which
is not a medoid:
- Swap m and o, associate each data point to the closest medoid, and
recompute the cost.
- If the total cost is more than that in the previous step, undo the swap.

Let's consider the following example (the data table and the corresponding
scatter plot are not reproduced here).
Step 1: Randomly select k = 2 medoids. Let C1 = (4, 5) and C2 = (8, 5) be the
two medoids.
Step 2: Calculate the cost. The dissimilarity of each non-medoid point with the
medoids is calculated and tabulated.

Here, we have used the Manhattan distance formula to calculate the distances
between the medoids and the non-medoid points:
Distance = |X1 - X2| + |Y1 - Y2|.
Each point is assigned to the cluster of the medoid whose dissimilarity is
least. Points 1, 2, and 5 go to cluster C1, and points 0, 3, 6, 7, and 8 go to
cluster C2. The cost = (3 + 4 + 4) + (3 + 1 + 1 + 2 + 2) = 20.
Step 3: Randomly select one non-medoid point and recalculate the cost. Let the
randomly selected point be (8, 4). The dissimilarity of each non-medoid point
with the medoids C1 (4, 5) and C2 (8, 4) is calculated and tabulated.

Each point is assigned to the cluster whose medoid has the least dissimilarity.
So, points 1, 2, and 5 go to cluster C1, and points 0, 3, 6, 7, and 8 go to
cluster C2. The new cost = (3 + 4 + 4) + (2 + 2 + 1 + 3 + 3) = 22.
Swap cost = new cost - previous cost = 22 - 20 = 2, and 2 > 0. As the swap cost
is not less than zero, we undo the swap. Hence, (4, 5) and (8, 5) are the final
medoids. The time complexity is O(k(n - k)^2) per iteration.
Advantages:
1. It is simple to understand and easy to implement.
2. The K-Medoids algorithm is fast and converges in a fixed number of steps.
3. PAM is less sensitive to outliers than other partitioning algorithms.
Disadvantages:
1. The main disadvantage of the K-Medoids algorithm is that it is not suitable
for clustering non-spherical (arbitrarily shaped) groups of objects. This is
because it relies on minimizing the distances between the non-medoid objects
and the medoid (the cluster center); briefly, it uses compactness as the
clustering criterion instead of connectivity.
2. It may obtain different results for different runs on the same dataset
because the first k medoids are chosen randomly.