Tutorial 6 - Clustering
Copyright © 2014 EMC Corporation. All Rights Reserved. Module 4: Analytics Theory/Methods 1
Equations (two dimensions)
■ WSS is the sum of the squares of the distances between each data point and
the closest centroid. The term q(i) indicates the closest centroid that is
associated with the ith point.
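As a minimal sketch of the WSS definition above: in two dimensions, the squared distance between a point (x, y) and a centroid (cx, cy) is (x - cx)^2 + (y - cy)^2, and WSS adds up this value for every point using its closest centroid q(i). The function names and the sample points below are hypothetical and are not taken from the tutorial's worked example.

```python
def squared_distance(p, c):
    """Squared Euclidean distance between point p and centroid c."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def wss(points, centroids):
    """Sum, over all points, of the squared distance to the closest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical data for illustration only
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
centroids = [(1.0, 1.0), (5.0, 4.0)]

total = wss(points, centroids)
print(total)  # 0 + 1 + 2 + 0 = 3.0
```

A smaller WSS means the points sit closer to their centroids, which is why it is commonly used to compare clusterings.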
- We calculate the distances for all the remaining points in the same way
- Allocate each point to the cluster of its closest centroid
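The allocation step can be sketched as follows: compute the distance from each point to every centroid and record the index of the nearest one. This is only an illustration. The helper name `assign_clusters` and the sample data are assumptions, not values from the slides.

```python
def assign_clusters(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    labels = []
    for p in points:
        # Squared distance from p to every centroid
        distances = [
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
        ]
        # Index of the smallest distance is the assigned cluster
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical data for illustration only
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
centroids = [(1.0, 1.0), (5.0, 4.0)]

labels = assign_clusters(points, centroids)
print(labels)  # [0, 0, 1, 1]
```

Squared distances are enough here, because the square root does not change which centroid is closest.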
Iteration 2
Centroids
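At the start of each new iteration, every centroid is recomputed as the mean of the points currently allocated to its cluster. A minimal sketch of that update, assuming every cluster has at least one member (the function name and data are hypothetical):

```python
def update_centroids(points, labels, k):
    """Recompute each of the k centroids as the mean of its member points."""
    centroids = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        # Assumption: no cluster is empty, so len(members) > 0
        centroids.append(
            tuple(sum(coord) / len(members) for coord in zip(*members))
        )
    return centroids


# Hypothetical data for illustration only
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
labels = [0, 0, 1, 1]

new_centroids = update_centroids(points, labels, 2)
print(new_centroids)  # [(1.5, 1.0), (4.5, 3.5)]
```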
Iteration 3
Centroids
Iteration 4
Centroids
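The iterations repeat the same two steps, allocation and centroid update, until the centroids stop changing. The loop below is a sketch of that procedure under stated assumptions: the starting centroids are supplied by the caller, an empty cluster keeps its previous centroid, and the sample points are hypothetical rather than the tutorial's data.

```python
def kmeans(points, centroids, max_iter=100):
    """Repeat allocation and centroid update until the centroids are stable."""
    for iteration in range(1, max_iter + 1):
        # Allocation: index of the closest centroid for every point
        labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update: mean of the member points of each cluster
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its old centroid
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            return centroids, labels, iteration
        centroids = new_centroids
    return centroids, labels, max_iter


# Hypothetical data and starting centroids for illustration only
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 4.0)]
start = [(1.0, 1.0), (2.0, 1.0)]

final_centroids, final_labels, iterations = kmeans(points, start)
print(final_centroids)  # [(1.5, 1.0), (4.5, 3.5)]
print(final_labels)     # [0, 0, 1, 1]
print(iterations)       # 3
```

The last iteration produces no change and serves only to confirm convergence.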
Example
Use the following plot to draw the final clusters
Extra Examples
• https://youtu.be/_S5tvagaQRU?t=174 (Video)
• https://www.youtube.com/watch?v=wt-X61BnUCA (Video)
• http://mnemstudio.org/clustering-k-means-example-1.htm (2d)
• https://www.saedsayad.com/clustering_kmeans.htm (2d)
• https://www.datascience.com/blog/k-means-clustering (Python)
• https://pythonprogramminglanguage.com/kmeans-elbow-method/ (WSS in Python)