Professional Documents
Culture Documents
Balanced K-Means Revisited-7
Balanced K-Means Revisited-7
3(2): 145–179
DOI: 10.3934/aci.2023008
Received: 06 October 2023
Accepted: 17 October 2023
https://www.aimspress.com/journal/aci
Published: 27 October 2023
Research article
Balanced k-means revisited
1. Introduction
The clustering problem is to partition objects into separate groups (called clusters) so that objects
within one cluster are more similar to each other than objects in different clusters [1]. Since the middle
of the 20th century, thousands of algorithms addressing the clustering problem have been published [2].
One of the most popular clustering algorithms is the k-means algorithm [2, 3]. It was first proposed
by [4] and [5]. This clustering algorithm aims to build k disjoint clusters such that the sum of squared
distances between the data points and their representatives is minimized. The representatives, called
centroids, are determined by the mean of the data points belonging to a cluster. As a distance function,
the Euclidean distance is used. The number of clusters k has to be set by the user.
169
1e13
1.46 RKM 4
1.40
2
1.38
1.36 1
1.34
0
15.0 12.5 10.0 7.5 5.0 2.5 0.0 15.0 12.5 10.0 7.5 5.0 2.5 0.0
SDCS SDCS
1e13
1.46 4
Running time in sec
1.44
1.42 3
SSE
1.40
2
1.38
1.36 1
1.34
0
0.9997 0.9998 0.9999 1.0000 0.9997 0.9998 0.9999 1.0000
Normalized entropy Normalized entropy
1e13
1.46 4
Running time in sec
1.44
1.42 3
SSE
1.40
2
1.38
1.36 1
1.34
0
300 310 320 330 300 310 320 330
Minimum cluster size Minimum cluster size