Clustering Techniques
Periodic Table
https://www.chemistryworld.com/opinion/machine-learning-mendeleevs-have-rediscovered-the-periodic-table/3010720.article
Coronaviridae Family
Helpful in vaccine development
Cluster analysis has been widely used in many applications, such as business intelligence,
image pattern recognition, web search, biology, and security.
Example application: spam filter for mail
Given a set of n objects, a partitioning method constructs k partitions of the data, where each partition represents a
cluster and k ≤ n. It then uses an iterative relocation technique that attempts to improve the partitioning by moving
objects from one group to another.
Partitioning
Centroid-based clustering is among the simplest clustering types in data mining. It works
on the closeness of the data points to a chosen central value. The dataset is divided into
a given number of clusters, and each cluster is represented by a vector of values (its
centroid). Each input data point is compared with these vectors and joins the cluster
whose vector differs from it the least.
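The centroid-based idea, combined with iterative relocation, can be sketched as a plain k-means loop. This is a minimal illustration rather than a definitive implementation: the function name `kmeans`, the caller-supplied starting centroids, and the sample points are assumptions made for the example, and Euclidean distance is assumed as the measure of closeness.

```python
import math

def kmeans(points, centroids, max_iters=100):
    """Plain k-means on a list of equal-length tuples (illustrative sketch)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster with the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # no centroid moved, so the partition is stable
        centroids = new_centroids
    return centroids, clusters

# Hypothetical sample data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The result depends on the starting centroids, which is why they are passed in explicitly here instead of being chosen at random.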
A hierarchical method creates a hierarchical decomposition of the given set of data objects. It is classified as either
agglomerative or divisive, based on how the hierarchical decomposition is formed.
Hierarchical Clustering
Hierarchical clustering, also known as connectivity-based clustering, is based on the principle that every object is
connected to its neighbors depending on their proximity distance (degree of relationship). The clusters are represented in
extensive hierarchical structures, separated by the maximum distance required to connect the cluster parts.
The clusters are represented as dendrograms, where the x-axis lists the individual objects and the y-axis shows the
distance at which clusters merge. Similar data objects are separated by a minimal distance and fall in the same cluster, while
dissimilar data objects are placed farther apart in the hierarchy. Mapped data objects correspond to a cluster with discrete
qualities with respect to multidimensional scaling, quantitative relationships among data variables, or, in some aspects,
cross-tabulation.
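Hierarchical Clustering -
Types
Agglomerative (Bottom Up): starts with every object in its own cluster (1 2 3 4 5 6 7) and merges clusters step by step.
Divisive (Top Down): starts with all objects in one cluster (1,2,3,4,5,6,7) and splits it step by step.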
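The bottom-up (agglomerative) strategy can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `agglomerative`, the choice of single linkage (distance between the closest pair of members), the stopping rule of k remaining clusters, and the sample points are all picked for the example.

```python
import math

def agglomerative(points, k):
    """Bottom-up clustering: merge the two closest clusters until k remain."""
    clusters = [[p] for p in points]  # every object starts in its own cluster
    merges = []                       # (distance, left cluster, right cluster)
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]                          # j > i, so index i is unaffected
    return clusters, merges

# Hypothetical sample data.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
clusters, merges = agglomerative(points, k=2)
```

The recorded merge distances are the heights at which a dendrogram would join the corresponding branches. A divisive method would run in the opposite direction, starting from one cluster and splitting it.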
Hierarchical Clustering
Dendrogram
A dendrogram is a diagram that shows the hierarchical relationship between objects. It is most
commonly created as an output from hierarchical clustering. The main use of a dendrogram is to
work out the best way to allocate objects to clusters.
Hierarchical Clustering
Dendrograms
Figure: two points, X1 = (5, 6) and X2 = (1, 3), with a and b the lengths of the two axis-parallel legs between them.

Euclidean distance, or the Euclidean metric, is the "ordinary" straight-line distance between two points in Euclidean space.
$ED = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
$ED = \sqrt{(5 - 1)^2 + (6 - 3)^2} = 5$

Manhattan distance between two points is the sum of the absolute differences of their Cartesian coordinates.
$MD = a + b$
$MD = 3 + 4 = 7$
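The two distance measures can be checked with a few lines of Python, using the points X1 = (5, 6) and X2 = (1, 3) from the example. The function names `euclidean` and `manhattan` are chosen for this sketch only.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

x1, x2 = (5, 6), (1, 3)
print(euclidean(x1, x2))  # 5.0
print(manhattan(x1, x2))  # 7
```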
Example application: traffic problem
Most partitioning methods cluster objects based on the distance between objects. Such methods can find only
spherical-shaped clusters and encounter difficulty in discovering clusters of arbitrary shapes.
Density based
Density-based clustering, whose best-known example is DBSCAN (Density-Based Spatial Clustering of
Applications with Noise), considers density ahead of distance. Data is clustered by regions of high
concentrations of data objects bounded by areas of low concentrations of data objects. The
clusters formed are grouped as a maximal set of connected data points.
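A compact sketch of the DBSCAN idea is shown below. It is illustrative only: the function name `dbscan`, the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighborhood size, counting the point itself), the brute-force neighbor search, and the sample points are assumptions made for this example.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    def neighbors(i):
        # Brute-force search; the result includes the point itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1        # i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise reached from a core point
            if labels[j] is not None:
                continue             # already assigned; do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is also a core point: keep growing
                queue.extend(nbrs)
    return labels

# Hypothetical sample data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Because clusters grow by chaining dense neighborhoods rather than by distance to a center, this approach can follow clusters of arbitrary shape and leaves isolated points labeled as noise.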
Example application: retail customers
Grid-based methods quantize the object space into a finite number of cells that form a grid structure. Using grids is often
an efficient approach to many spatial data mining problems, including clustering. Therefore, grid-based methods can be
integrated with other clustering methods such as density-based methods and hierarchical methods.
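One way to picture a grid-based method combined with a density criterion is sketched below. It is a simplified illustration, not a specific published algorithm: the function names `dense_cells` and `grid_clusters`, the square cells of side `cell_size`, the `min_count` threshold, and the sample points are all assumptions for the example.

```python
from collections import Counter

def dense_cells(points, cell_size, min_count):
    """Quantize 2-D points into square cells; keep cells with enough points."""
    counts = Counter((int(x // cell_size), int(y // cell_size))
                     for x, y in points)
    return {cell for cell, n in counts.items() if n >= min_count}

def grid_clusters(points, cell_size, min_count):
    """Group dense cells that share an edge into clusters of cells."""
    dense = dense_cells(points, cell_size, min_count)
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        group, stack = [], [start]
        while stack:  # flood fill over edge-adjacent dense cells
            cx, cy = stack.pop()
            group.append((cx, cy))
            for cell in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if cell in dense and cell not in seen:
                    seen.add(cell)
                    stack.append(cell)
        clusters.append(sorted(group))
    return clusters

# Hypothetical sample data.
points = [(0.1, 0.2), (0.5, 0.5), (1.2, 0.3), (1.7, 0.9),
          (5.5, 5.5), (5.1, 5.9), (9.0, 1.0)]
clusters = grid_clusters(points, cell_size=1, min_count=2)
```

After the points are counted into cells, the remaining work depends on the number of occupied cells rather than the number of objects, which is the usual source of the efficiency mentioned above.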
Other Clustering Techniques: Constraint-Based