

Presented by: Nikita Gulati

What is Data Mining?

• Data mining is the process of sorting through
large data sets to identify patterns and establish
relationships to solve problems through data
analysis. Data mining tools allow enterprises to
predict future trends.
What is a Cluster?

• A cluster is a group of objects that belong to the
same class. In other words, similar objects are
grouped in one cluster and dissimilar objects are
grouped in another cluster.
What is Clustering or Cluster Analysis?
• Cluster analysis or clustering is the task of
grouping a set of objects in such a way that
objects in the same group (cluster) are more
similar (in some sense) to each other than to
those in other groups. It is a main task of
exploratory data mining, and a common
technique for statistical data analysis, used in
many fields.
• While doing cluster analysis, we first partition
the set of data into groups based on data
similarity and then assign the labels to the
groups.
• The main advantage of clustering over
classification is that it is adaptable to changes
and helps single out useful features that
distinguish different groups.
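The two steps described above (partition by similarity, then label the groups) can be sketched in a few lines. This is a minimal illustration only: the slides do not name a specific algorithm, so k-means is used here as one familiar choice, and the function name `kmeans`, the sample points, and the choice of the first k points as starting centers are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=20):
    """Partition points into k groups by Euclidean distance.

    Returns one label per point plus the final cluster centers.
    """
    # Assumption: the first k points are the initial centers,
    # which keeps this sketch deterministic.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical sample data: two visibly separate groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.2)]
labels, centers = kmeans(points, k=2)
print(labels)
```

No class labels are supplied in advance; the labels come out of the grouping itself, which is the difference from classification noted above.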
Applications of Clustering
• Biology
• Information retrieval
• Land use
• Marketing
• City planning
• Earthquake studies
• Climate
• Economic Science
Requirements of Clustering
• Scalability
• Ability to deal with different kinds of attributes
• Discovery of clusters with arbitrary shape
• High dimensionality
• Ability to deal with noisy data
• Interpretability
What is Fuzzy Clustering?
• Fuzzy clustering (also referred to as soft
clustering or soft k-means) is a form of
clustering in which each data point can belong to
more than one cluster.
• Clusters are identified via similarity measures.
These similarity measures include distance,
connectivity, and intensity. Different similarity
measures may be chosen based on the data or
the application.
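To make "each data point can belong to more than one cluster" concrete, here is a minimal sketch of fuzzy c-means, one common fuzzy clustering method. It assumes Euclidean distance as the similarity measure and a fuzzifier `m = 2`; the function names, sample points, and starting centers are hypothetical and chosen only for illustration.

```python
import math

def fuzzy_memberships(points, centers, m=2.0):
    """For each point, return its degree of membership in every cluster.

    Each row sums to 1; m > 1 controls how soft the assignment is.
    """
    exponent = 2.0 / (m - 1.0)
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # A point lying exactly on a center belongs fully to it.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [
                1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                for d_j in dists
            ]
        result.append(row)
    return result

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    dims = len(points[0])
    for _ in range(iterations):
        u = fuzzy_memberships(points, centers, m)
        new_centers = []
        for j in range(len(centers)):
            # Each point pulls the center in proportion to membership^m.
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    return centers, fuzzy_memberships(points, centers, m)

# Hypothetical data: two tight groups plus one point in between.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.1, 4.9), (3.0, 3.0)]
centers, memberships = fuzzy_c_means(points, centers=[(0.0, 0.0), (6.0, 6.0)])
for p, row in zip(points, memberships):
    print(p, [round(v, 2) for v in row])
```

Points near one group get a membership close to 1 for that cluster, while the in-between point (3.0, 3.0) is shared roughly evenly between the two. Swapping `math.dist` for another measure would change the memberships, which is why the choice of similarity measure depends on the data or the application.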
Thank You!
