Welcome to Scribd!

Balanced K-Means Revisited-6

Uploaded by

0% found this document useful (0 votes)

3 views2 pages

This document compares the performance of different clustering algorithms on various datasets. It finds that the BKM+ algorithm, which uses a variable balance constraint, improves upon standard k-means by reaching the correct clustering for datasets that k-means fails on. When applied to a large dataset of house coordinates in Finland (the Santa dataset), BKM+ finds an approximate clustering faster and using less memory than the RKM algorithm, though some errors remain at cluster borders. These errors are further reduced using a post-processing step with the BKM++ algorithm.

Original Description:

Original Title

Balanced k-means revisited-6

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

3 views2 pages

Balanced K-Means Revisited-6

Uploaded by

jefferyleclerc

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 2

Search inside document

165

Dataset (n,d,k) Algo SSE NMI Time (s)

features RKM 1.760e+06 0.731 0.19
(2000, 240, 10) BKM 1.751e+06 0.747 936.93
BKM+ 1.760e+06 0.707 0.53
BKM++ 1.759e+06 0.708 0.56

santa RKM 3.683e+15 - 1612

(1440k, 2, 20) BKM - -
BKM+ 3.745e+15 - 310
BKM++ 3.653e+15 - 367

The results of BKM+ are visualized for selected datasets in Figures 3, 4 and 5. Our first observation
is that the use of variable balance constraint actually improves results also compared to standard k-means.
The datasets A3 and B1 cannot be successfully clustered by k-means even if repeated 100 times or using
better initialization (Maxmin or k-means++) [11]. However, BKM+ can reach the correct clustering
because the balance constraint makes the algorithm a stochastic variant of k-means, which helps to
overcome local minima.
The two most demanding datasets for BKM+ are Santa [44] and Unbalanced; both are characterized
by variable-size clusters. The Santa dataset (see Figure 4 and Table 2) contains 1.4 million points
representing house coordinates in Finland. BKM+ algorithm finds approximately the correct clustering
but there are minor degradation at the cluster borders (SSE 3.745e+15). These errors are fixed by using
the post-processing algorithm BKM++ (SSE 3.653e+15) which yields also slightly better results than
RKM (SSE 3.683e+15).
For the Santa dataset (Figure 4), the proposed method has better time and memory efficiency
compared to RKM. It took only 19% of the processing time required by RKM, and the memory footprint
was only 8% compared to RKM (125MB vs. 1,403MB).
BKM++ could significantly (>0.5%) improve the SSE results of BKM+ in case of 8 of the 19
datasets. BKM++ results were almost identical to RKM results. There were only three cases where
SSE result of RKM differed more than 0.1% compared to BKM++. For two of these cases BKM++
had a smaller SSE (santa -0.8%, libra -1.2%, S2 +0.1%).

Applied Computing and Intelligence Volume 3, Issue 2, 145–179

166

Figure 4. Results for the santa dataset. Best SSE out of 100 repeats.

Applied Computing and Intelligence Volume 3, Issue 2, 145–179

Spreadsheet Modeling and Decision Analysis A Practical Introduction To Business Analytics 7th Edition Cliff Ragsdale Solutions Manual
Document14 pages
Spreadsheet Modeling and Decision Analysis A Practical Introduction To Business Analytics 7th Edition Cliff Ragsdale Solutions Manual
ShawnMitchellbfqyz
92% (12)
Soft Fund
Document135 pages
Soft Fund
sruthimannam28
50% (2)
Paging Load Calculation
Document2 pages
Paging Load Calculation
Rao Dheeru
No ratings yet
Fox, Goose, Grain Artificial Intelligence Problem
Document10 pages
Fox, Goose, Grain Artificial Intelligence Problem
balwinder
100% (7)
Balanced K-Means Revisited-5
Document3 pages
Balanced K-Means Revisited-5
jefferyleclerc
No ratings yet
All All '2.jpg': For For For End
Document7 pages
All All '2.jpg': For For For End
Rohan Rakesh Kumar Sharma 16BCE0454
No ratings yet
High Speed Power Efficient Carry Select Adder Design: 2159-3477/17 $31.00 © 2017 IEEE DOI 10.1109/ISVLSI.2017.16 32
Document6 pages
High Speed Power Efficient Carry Select Adder Design: 2159-3477/17 $31.00 © 2017 IEEE DOI 10.1109/ISVLSI.2017.16 32
rc1306
No ratings yet
Slow Running SQL Degrade Oracle Performance
Document17 pages
Slow Running SQL Degrade Oracle Performance
patrick_wangrui
100% (10)
Ottankadu - Punalvasal - Safe
Document6 pages
Ottankadu - Punalvasal - Safe
Renugopal
No ratings yet
Topic 11 - Empirical Evidence
Document18 pages
Topic 11 - Empirical Evidence
Dweep Kapadia
No ratings yet
83603645-Ericsson-RBS-6000-Sync-Alarm (Enregistré Automatiquement)
Document5 pages
83603645-Ericsson-RBS-6000-Sync-Alarm (Enregistré Automatiquement)
Wissem Ben Attia
No ratings yet
Georglm: A Package For Generalised Linear Spatial Models Introductory Session
Document10 pages
Georglm: A Package For Generalised Linear Spatial Models Introductory Session
Alvaro Ernesto Garzon Garzon
No ratings yet
CMOS Binary Full Adder
Document20 pages
CMOS Binary Full Adder
Hani Masoumi
No ratings yet
Package 4 Pavement Design
Document6 pages
Package 4 Pavement Design
Sumit Jain
No ratings yet
Low Power Efficient MAC Unit Using Proposed Carry Select Adder
Document6 pages
Low Power Efficient MAC Unit Using Proposed Carry Select Adder
sureshraja001
No ratings yet
Ex Ch1-1 (Answer)
Document20 pages
Ex Ch1-1 (Answer)
Jeremy Bibi
No ratings yet
Time Date Timestamp Interval
Document12 pages
Time Date Timestamp Interval
atul_katiyar85
No ratings yet
TimeSeries Report
Document41 pages
TimeSeries Report
Sukanya Manickavel
No ratings yet
Areeba Techno India University
Document1 page
Areeba Techno India University
Alex Danver
No ratings yet
PDF Solutions Manual To Accompany Miller Freunds Probability and Statistics For Engineers 8Th Edition 0321640772 Online Ebook Full Chapter
Document27 pages
PDF Solutions Manual To Accompany Miller Freunds Probability and Statistics For Engineers 8Th Edition 0321640772 Online Ebook Full Chapter
samuel.byrd868
100% (8)
Sayeed Techno India University
Document1 page
Sayeed Techno India University
Alex Danver
No ratings yet
Book Rate
Document7 pages
Book Rate
Claire Marshall
No ratings yet
Perkerasan Kaku
Document5 pages
Perkerasan Kaku
Teknik Sipil
No ratings yet
Elx305 Ref Exam Solutions
Document7 pages
Elx305 Ref Exam Solutions
Nadeesha Bandara
100% (1)
Dates Timestamp Operations in Teradata
Document8 pages
Dates Timestamp Operations in Teradata
9767793694
No ratings yet
Bachelor of Technology in 'Computer Science and Engineering - (2020-21)
Document2 pages
Bachelor of Technology in 'Computer Science and Engineering - (2020-21)
Nishchay Verma
No ratings yet
f13_3300_e1.pdf
Document8 pages
f13_3300_e1.pdf
Mohamed
No ratings yet
ARIMA Predict Forecast
Document1 page
ARIMA Predict Forecast
Manoj M
No ratings yet
CH 9
Document8 pages
CH 9
Maulik Chaudhary
No ratings yet
SQL - Numeric Functions - GeeksforGeeks
Document7 pages
SQL - Numeric Functions - GeeksforGeeks
mohan krishna
No ratings yet
Korelasi: Abdillah Khoirul Amar 2024-03-08
Document10 pages
Korelasi: Abdillah Khoirul Amar 2024-03-08
abdillah.k.a178
No ratings yet
Implementation of Area Efficient Carry-Select Adder: K. R P, V. N R
Document6 pages
Implementation of Area Efficient Carry-Select Adder: K. R P, V. N R
vta
No ratings yet
18cs733 2021#2 KJ RP SK Scheme
Document3 pages
18cs733 2021#2 KJ RP SK Scheme
Rajeshwari R P
No ratings yet
Eigrp Metric
Document4 pages
Eigrp Metric
Anonymous qNY1Xcw7R
No ratings yet
JVM Memory Management & Diagnostics
Document24 pages
JVM Memory Management & Diagnostics
Kshirabdhi Tanaya
No ratings yet
GSM BSS Network KPI (Call Setup Success Rate) Optimization Manual V1.0
Document17 pages
GSM BSS Network KPI (Call Setup Success Rate) Optimization Manual V1.0
zaidihussain14
No ratings yet
27.2Kbps High-Speed Signaling RB: Telenor Pakistan
Document7 pages
27.2Kbps High-Speed Signaling RB: Telenor Pakistan
Qasim Abbas Alvi
100% (1)
CRC
Document83 pages
CRC
Hari Krishna
No ratings yet
Lecture Notes Workings
Document135 pages
Lecture Notes Workings
chandu
No ratings yet
S CS2 PaperB PDF
Document13 pages
S CS2 PaperB PDF
Ashuleena
No ratings yet
CS4961 L15
Document14 pages
CS4961 L15
Anirudh Seth
No ratings yet
A New Vlsi Architecture of Parallel Multiplier-Accumulator Based On Radix-2 Modified Booth Algorithm
Document6 pages
A New Vlsi Architecture of Parallel Multiplier-Accumulator Based On Radix-2 Modified Booth Algorithm
praneshece09
No ratings yet
Solution Manual For Spreadsheet Modeling and Decision Analysis A Practical Introduction To Business Analytics 7Th Edition Cliff Ragsdale 1285418689 978128541868 Full Chapter PDF
Document36 pages
Solution Manual For Spreadsheet Modeling and Decision Analysis A Practical Introduction To Business Analytics 7Th Edition Cliff Ragsdale 1285418689 978128541868 Full Chapter PDF
russell.humphrey110
100% (15)
2002 University/College IC Contest Cell-Based IC For Graduate Level - Final Stage
Document14 pages
2002 University/College IC Contest Cell-Based IC For Graduate Level - Final Stage
E94086018陳寧文
No ratings yet
Low Delay Based QSD Multiplier
Document6 pages
Low Delay Based QSD Multiplier
International Journal of Innovative Science and Research Technology
100% (2)
Assembly Line Balancing Methods-A Case Study: Vrittika V Pachghare, R. S. Dalu
Document5 pages
Assembly Line Balancing Methods-A Case Study: Vrittika V Pachghare, R. S. Dalu
Kenan Muhamedagic
No ratings yet
Rajasthan Technical University, Kota Teaching & Scheme of Examination For B.Tech. (Computer Engineering)
Document4 pages
Rajasthan Technical University, Kota Teaching & Scheme of Examination For B.Tech. (Computer Engineering)
MUKESH
No ratings yet
Error Pentium
Document25 pages
Error Pentium
Yerson Zender
No ratings yet
Poovalur - Neiveli - Safe
Document6 pages
Poovalur - Neiveli - Safe
Renugopal
No ratings yet
Period Start Time PLMN Name: Voice Call Setup SR (Rrc+Cu) CSSR Ps NRT Rrcblockingcongestioncs
Document44 pages
Period Start Time PLMN Name: Voice Call Setup SR (Rrc+Cu) CSSR Ps NRT Rrcblockingcongestioncs
denoo william
No ratings yet
Semester Project: Automotive Suspension System
Document14 pages
Semester Project: Automotive Suspension System
Ammar Ahmed
No ratings yet
Data Transformation
Document12 pages
Data Transformation
Cherry
No ratings yet
Isaa Lab Da5
Document14 pages
Isaa Lab Da5
K 9
No ratings yet
Constraints Sta PDF
Document92 pages
Constraints Sta PDF
Pudi Sriharsha
No ratings yet
ESO207 - Assignment 1 Tanush Kumar (201043) Kartikeya Raghuvanshi (200494)
Document6 pages
ESO207 - Assignment 1 Tanush Kumar (201043) Kartikeya Raghuvanshi (200494)
Tanush
No ratings yet
Architectures For Mixcolumn Transform For The Aes: P. Noo-Intara, S. Chantarawong, and S. Choomchuay
Document5 pages
Architectures For Mixcolumn Transform For The Aes: P. Noo-Intara, S. Chantarawong, and S. Choomchuay
Ngọc Lợi
No ratings yet
Design NH-75 Ext
Document2 pages
Design NH-75 Ext
GOVIND KUMAR
No ratings yet
A Dynamic Search Range Algorithm For H.264/AVC Full-Search Motion Estimation
Document4 pages
A Dynamic Search Range Algorithm For H.264/AVC Full-Search Motion Estimation
Nathan Love
No ratings yet
Data-Variant Kernel Analysis
From Everand
Data-Variant Kernel Analysis
Yuichi Motai
No ratings yet
Space Modulation Techniques
From Everand
Space Modulation Techniques
Raed Mesleh
No ratings yet
Metaheuristics for String Problems in Bio-informatics
From Everand
Metaheuristics for String Problems in Bio-informatics
Christian Blum
No ratings yet
Evolutionary Algorithms for Mobile Ad Hoc Networks
From Everand
Evolutionary Algorithms for Mobile Ad Hoc Networks
Bernabé Dorronsoro
No ratings yet
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-H
Document4 pages
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-H
jefferyleclerc
No ratings yet
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-3
Document3 pages
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-3
jefferyleclerc
No ratings yet
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-C
Document10 pages
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-C
jefferyleclerc
No ratings yet
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-A
Document7 pages
2023 Data, Analytics, and Artificial Intelligence Adoption Strategy-A
jefferyleclerc
No ratings yet
2 Mapreduce Model Principles
Document7 pages
2 Mapreduce Model Principles
jefferyleclerc
No ratings yet
MapReduce - What It Is, and Why It Is So Popular
Document7 pages
MapReduce - What It Is, and Why It Is So Popular
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-1E
Document2 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-1E
jefferyleclerc
No ratings yet
Hadoop
Document7 pages
Hadoop
jefferyleclerc
No ratings yet
Paper Dvi
Document7 pages
Paper Dvi
jefferyleclerc
No ratings yet
Balanced K-Means Revisited-5
Document3 pages
Balanced K-Means Revisited-5
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-1Q
Document2 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-1Q
jefferyleclerc
No ratings yet
Balanced K-Means Revisited-1
Document3 pages
Balanced K-Means Revisited-1
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-9
Document4 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-9
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-17
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-17
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-O
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-O
jefferyleclerc
No ratings yet
A Distance-Based Kernel For Classification Via Support Vector Machines - PMC-17
Document1 page
A Distance-Based Kernel For Classification Via Support Vector Machines - PMC-17
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-16
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-16
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-14
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-14
jefferyleclerc
No ratings yet
Data Visualization Cheat Sheet For Basic Machine Learning Algorithms - by Boriharn K - Mar, 2024 - Towards Data Science
Document3 pages
Data Visualization Cheat Sheet For Basic Machine Learning Algorithms - by Boriharn K - Mar, 2024 - Towards Data Science
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-P
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-P
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-A
Document6 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-A
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community
jefferyleclerc
No ratings yet
Tutorial For K Means Clustering in Python Sklearn - MLK - Machine Learning Knowledge-5
Document3 pages
Tutorial For K Means Clustering in Python Sklearn - MLK - Machine Learning Knowledge-5
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-4
Document3 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-4
jefferyleclerc
No ratings yet
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-5
Document4 pages
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-5
jefferyleclerc
No ratings yet
Improved K-Means Map Reduce Algorithm For Big Data Cluster Analysis
Document7 pages
Improved K-Means Map Reduce Algorithm For Big Data Cluster Analysis
jefferyleclerc
No ratings yet
K-Means Clustering Optimization Algorithm Based On Mapreduce
Document6 pages
K-Means Clustering Optimization Algorithm Based On Mapreduce
jefferyleclerc
No ratings yet
Fast Scalable K-Means++ Algorithm With Mapreduce
Document2 pages
Fast Scalable K-Means++ Algorithm With Mapreduce
jefferyleclerc
No ratings yet
The Incremental Online K Means Clustering Algorithm and Its Application To Color Quantization
Document42 pages
The Incremental Online K Means Clustering Algorithm and Its Application To Color Quantization
jefferyleclerc
No ratings yet
Analysis of Mapreduce Algorithms: Harini Padmanaban
Document6 pages
Analysis of Mapreduce Algorithms: Harini Padmanaban
jefferyleclerc
No ratings yet
Brochure Master in Cyber Security
Document24 pages
Brochure Master in Cyber Security
robaks
No ratings yet
Image Compression: Sankalp Kallakuri
Document21 pages
Image Compression: Sankalp Kallakuri
Sankalp_Kallakur_402
No ratings yet
Third Year Sixth Semester CS6660 Compiler Design Two Mark With Answer
Document11 pages
Third Year Sixth Semester CS6660 Compiler Design Two Mark With Answer
PRIYA RAJI
100% (1)
Controlling The Solution of Linear Systems - Fenicsproject
Document6 pages
Controlling The Solution of Linear Systems - Fenicsproject
Xu Moran
No ratings yet
Data Structures (OE-1) Syllabus
Document2 pages
Data Structures (OE-1) Syllabus
teludamodarjntu
No ratings yet
Dlvu Lecture12
Document19 pages
Dlvu Lecture12
Awatef Messaoudi
No ratings yet
Linear Queue Implementation Array: Array Items (Size) Front Rear
Document5 pages
Linear Queue Implementation Array: Array Items (Size) Front Rear
AhmaadFahimm
No ratings yet
Fruit Recognition System and Its Nutrition Level
Document12 pages
Fruit Recognition System and Its Nutrition Level
Birinchi Medhi
No ratings yet
Strassen's Matrix Multiplication
Document12 pages
Strassen's Matrix Multiplication
Vijay Trivedi
No ratings yet
CSI 411 - Compiler - Lecture 2 PDF
Document22 pages
CSI 411 - Compiler - Lecture 2 PDF
Md Tasriful Hoque
No ratings yet
DibyasomPuhan - DAA Assignment 2
Document14 pages
DibyasomPuhan - DAA Assignment 2
RAHUL DHANOLA
No ratings yet
MAD101 Chap1
Document220 pages
MAD101 Chap1
hung le
No ratings yet
Final Sets
Document13 pages
Final Sets
Dasaradha Rami Reddy V
No ratings yet
S2 Big Data Week 5 Quiz - Attempt Review
Document9 pages
S2 Big Data Week 5 Quiz - Attempt Review
cici
No ratings yet
PIC16F887
Document17 pages
PIC16F887
Triple Vi
No ratings yet
5 2 Algoritam Ucenja Graficki BP
Document23 pages
5 2 Algoritam Ucenja Graficki BP
Nedim Hadžiaganović
No ratings yet
CS2040S CheatSheet
Document2 pages
CS2040S CheatSheet
Peter Joseph Fung
No ratings yet
Compiler: Mahmudul Hasan (Moon)
Document4 pages
Compiler: Mahmudul Hasan (Moon)
Mahmudul Hasan
No ratings yet
Mock Final Exam (p1)
Document4 pages
Mock Final Exam (p1)
Zhansaya Zhassulanova
No ratings yet
Outline of AI
Document3 pages
Outline of AI
Omair Gull
No ratings yet
A Hybrid Algorithm For Vehicle Routing Problem With Time Windows and Target Time
Document10 pages
A Hybrid Algorithm For Vehicle Routing Problem With Time Windows and Target Time
Karim EL Bouyahyiouy
No ratings yet
ICSE Class 10 Computer Applications (Java) 2012 Solved Question Paper
Document64 pages
ICSE Class 10 Computer Applications (Java) 2012 Solved Question Paper
Kaushik Choudhury
No ratings yet
Ada Theory
Document10 pages
Ada Theory
Vikram Rao
No ratings yet
Proportional Logic and First Order Logic
Document41 pages
Proportional Logic and First Order Logic
Raghavendra Sukumaran
No ratings yet
Lecture 2
Document25 pages
Lecture 2
Abhishek
No ratings yet
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
Document46 pages
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
NoidaTut E-Learn
No ratings yet
ISO/TC 46/SC 9/working Group 1
Document4 pages
ISO/TC 46/SC 9/working Group 1
João Paulo César
No ratings yet
Artificial Intelligence
Document11 pages
Artificial Intelligence
ABHISHEK KUMAR
No ratings yet