
Clustering

• A form of unsupervised learning
• Works on unlabeled datasets

Clustering is a way of grouping data points into clusters of similar points. Points in the same cluster are as
similar to one another as possible and have little or no similarity to points in other clusters.
Applications of Clustering in different fields
• Marketing: It can be used to characterize & discover customer segments for
marketing purposes.
• Biology: It can be used for classification among different species of plants and
animals.
• Libraries: It is used in clustering different books on the basis of topics and
information.
• Insurance: It is used to understand customers and their policies and to identify
fraud.
• City Planning: It is used to make groups of houses and to study their values based on
their geographical locations and other factors present.
• Anomaly detection
• Social network analysis
• Search engines
Let's assume that we have the data set {2, 3, 4, 10, 11, 12, 20, 25, 30}.
We need to cluster the data into 2 sets.
If the initial means are not given, assume them;
here we assume that m1 = 4 and m2 = 12 [two mean values are needed because there are two clusters, K1 and K2].
Step 1: Taking m1 = 4 and m2 = 12, find the elements of clusters K1 and K2.
To decide which cluster an element belongs to, compute its distance to each mean and assign it to the
cluster whose mean is nearest.
K1 = {2, 3, 4}
K2 = {10, 11, 12, 20, 25, 30}
Compute the new means: m1 = (2 + 3 + 4) / 3 = 3
m2 = (10 + 11 + 12 + 20 + 25 + 30) / 6 = 18
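The assignment and mean update in Step 1 can be sketched in a few lines of Python. This is a minimal sketch, assuming absolute difference as the distance; the function names `assign` and `update` are illustrative and do not come from the slides.

```python
def assign(data, means):
    """Put each value in the cluster whose mean is nearest.

    Distance is the absolute difference (an assumption for 1-D data).
    On a tie, the cluster with the lower index wins.
    """
    clusters = [[] for _ in means]
    for x in data:
        nearest = min(range(len(means)), key=lambda i: abs(x - means[i]))
        clusters[nearest].append(x)
    return clusters


def update(clusters):
    """Recompute each cluster's mean (assumes no cluster is empty)."""
    return [sum(c) / len(c) for c in clusters]


data = [2, 3, 4, 10, 11, 12, 20, 25, 30]
k1, k2 = assign(data, [4, 12])   # initial means m1 = 4, m2 = 12
m1, m2 = update([k1, k2])
print(k1, k2)  # [2, 3, 4] [10, 11, 12, 20, 25, 30]
print(m1, m2)  # 3.0 18.0
```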
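Step 2:
Based on the new means, find the new clusters K1 and K2. For example, 10 is now nearer to m1 = 3 (distance 7)
than to m2 = 18 (distance 8), so it moves to K1.
K1 = {2, 3, 4, 10}
K2 = {11, 12, 20, 25, 30}
New means: m1 = 19 / 4 = 4.75 ≈ 5
m2 = 98 / 5 = 19.6 ≈ 20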
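The Step 2 means are rounded to 5 and 20 here. A quick check, assuming absolute difference as the distance, suggests that for this data set the rounding does not change which cluster each point is assigned to next. The helper name `nearest_mean_index` is illustrative.

```python
def nearest_mean_index(x, means):
    """Return the index of the mean closest to x (lower index wins a tie)."""
    return min(range(len(means)), key=lambda i: abs(x - means[i]))


data = [2, 3, 4, 10, 11, 12, 20, 25, 30]

# Assignments using the exact means versus the rounded means.
exact = [nearest_mean_index(x, [4.75, 19.6]) for x in data]
rounded = [nearest_mean_index(x, [5, 20]) for x in data]

print(exact)             # [0, 0, 0, 0, 0, 0, 1, 1, 1]
print(exact == rounded)  # True
```

In general, rounding intermediate means can shift a point that lies near the boundary between two clusters, so keeping the exact values is the safer choice.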
Step 3:
New clusters: K1 = {2, 3, 4, 10, 11, 12}
K2 = {20, 25, 30}
New means: m1 = 42 / 6 = 7 and m2 = 75 / 3 = 25
Step 4:
New clusters: K1 = {2, 3, 4, 10, 11, 12}
K2 = {20, 25, 30}
m1 = 7 and m2 = 25
Since both clusters and both means are the same as in Step 3, the procedure terminates.
