
Data Mining

Techniques & Applications

Cluster Detection Methods


Topics
 Problem of Cluster Detection
 Measures of Proximity
 Basic Clustering Methods
 Cluster Evaluation and Validation
 Advanced Methods
 Cluster Detection in Practice
Sarajevo School of Science and Technology 2
Overview
 What is Cluster Detection?
◼ A descriptive task
◼ A cluster refers to a group of objects among
which there exists a degree of similarity.
◼ Cluster detection or clustering is the
automatic process of discovering clusters
within a given set of data objects. The centre
of a cluster is known as the centroid.

Sarajevo School of Science and Technology 3


Overview
 Basic Elements of a Clustering Solution:
◼ A sensible measure for similarity among data objects
◼ An effective (complete, correct) and efficient cluster
detection method
◼ A goodness-of-fit measure for evaluating the quality
of clusters
 What is a Good Clustering Result?
◼ high intra-cluster similarity and low inter-cluster
similarity
◼ Most, if not all, data objects belong to a cluster

Sarajevo School of Science and Technology 4


Overview
 Requirements for Clustering Solutions
◼ Scalability, being able to cope with large data sets
◼ Being able to deal with different types of attributes
◼ Being able to discover clusters of arbitrary shapes
◼ Minimal requirements for domain knowledge to determine
input parameters
◼ Being able to deal with noise and outliers
◼ Insensitive to order of input data records
◼ Being able to deal with high dimensionality
◼ Incorporation of user-specified constraints
◼ Interpretability and usability

Sarajevo School of Science and Technology 5


Overview
 Challenges for Clustering
◼ Similarity is a relative concept
◼ There may be no meaningful clusters that
exist in the given set
◼ Clusters can be of poor quality

Sarajevo School of Science and Technology 6


Proximity: The Basics
 Proximity can be measured either as similarity or
dissimilarity.
 Similarity is a numeric measure of the degree to which two
data objects are alike, whereas dissimilarity is a numeric
measure of the degree to which two objects differ.
 Similarity measure and dissimilarity measure can be
converted from one to the other. Normally, dissimilarity is
preferred because it is easier to measure distance between
two data objects.
 Measure of distance between two data objects involves
combining measures of difference between values of the
corresponding attributes for the two data objects.

Sarajevo School of Science and Technology 7


Proximity: The Basics
 Difference of values for a single attribute is
directly related to the domain type of the
attribute.
 Distance functions satisfying the following
properties are called metric:
◼ d(x,y) ≥ 0 and d(x,x) = 0, for all data x and y
◼ d(x,y) = d(y,x), for all data objects x and y
◼ d(x,y) ≤ d(x,z) + d(z,y), for all data x, y and z

Sarajevo School of Science and Technology 8


Proximity:
Difference Between Attribute Values
 Difference between values of nominal
attributes
◼ Since the only operators applicable to nominal
attributes are = and ≠, if two names are the same,
the difference should be 0; otherwise it should
be the maximum (∞).
 e.g. For data objects <“Mary”, f>, <“John”, m> and
<“Liz”, f>, as far as gender is concerned, Mary and
Liz are similar whereas John and Liz are not.
◼ The difference measure for binary attributes is the same.
Sarajevo School of Science and Technology 9
Proximity:
Difference Between Attribute Values
 Difference between values of ordinal attributes
◼ Since there exists an order among ordinal values, different
degrees of difference can be represented and compared.
 e.g. For an ordinal attribute with values {bad, OK, good,
excellent}, the difference between good and excellent is
less than that between OK and excellent.
◼ Converting ordinal values to consecutive integers
 e.g. For the example above, {bad, OK, good, excellent} can
be converted to {0, 1, 2, 3}. Then d(good, excellent) = 3 – 2
= 1 whereas d(OK, excellent) = 3 – 1 = 2.
 Distance measure for interval and ratio attributes

Sarajevo School of Science and Technology 10


Proximity:
Dissimilarity Between Data Objects
 Ratio of mismatched features for nominal attributes
◼ Given two data objects i and j of p nominal attributes, let m represent the
number of attributes where the values of the two objects match. Then:

  d(i, j) = (p − m) / p

  Body Weight  Body Height  Blood Pressure  Blood Sugar Level  Habit      Class
  heavy        short        high            3                  smoker     P
  heavy        short        high            1                  nonsmoker  P
  normal       tall         normal          3                  nonsmoker  N
  heavy        tall         normal          2                  smoker     N
  low          medium       normal          2                  nonsmoker  N
  low          tall         normal          1                  nonsmoker  P
  normal       medium       high            3                  smoker     P
  low          short        high            2                  smoker     P
  heavy        tall         high            2                  nonsmoker  P
  low          medium       normal          3                  smoker     P
  heavy        medium       normal          3                  nonsmoker  N

  d(row1, row2) = (6 − 4) / 6 = 1/3
  d(row1, row3) = (6 − 1) / 6 = 5/6
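A small Python sketch of this mismatch-ratio computation (not part of the original slide); the records are rows 1–3 of the table above:

def nominal_dissimilarity(i, j):
    """Ratio of mismatched attributes: d(i, j) = (p - m) / p."""
    p = len(i)                                   # total number of attributes
    m = sum(1 for a, b in zip(i, j) if a == b)   # number of matching attributes
    return (p - m) / p

row1 = ("heavy", "short", "high", 3, "smoker", "P")
row2 = ("heavy", "short", "high", 1, "nonsmoker", "P")
row3 = ("normal", "tall", "normal", 3, "nonsmoker", "N")

print(nominal_dissimilarity(row1, row2))   # 0.333... (= 1/3)
print(nominal_dissimilarity(row1, row3))   # 0.833... (= 5/6)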

Sarajevo School of Science and Technology 11


Proximity:
Dissimilarity Between Data Objects
 Simple Match Coefficient/Jaccard Coefficient for
Binary Attributes
◼ Given two data objects i and j of p binary attributes. Let
 f00 = the number of attributes where both objects have
value 0
 f01 = the number of attributes where i has 0 and j has 1
 f10 = the number of attributes where i has 1 and j has 0
 f11 = the number of attributes where both objects have
value 1
◼ Simple Mismatch Coefficient (SMC) is defined as

  SMC(i, j) = (f01 + f10) / (f00 + f01 + f10 + f11)

◼ Jaccard Coefficient is defined as

  JC(i, j) = (f01 + f10) / (f01 + f10 + f11)

Sarajevo School of Science and Technology 12


Proximity:
Dissimilarity Between Data Objects
 Simple Match Coefficient/Jaccard
Coefficient for Binary Attributes
Name Gender Fever Cough Test-1 Test-2 Test-3 Test-4
Jack M Y N P N N N
Mary F Y N P N P N
Jim M Y P N N N N

  d(jack, mary) = (0 + 1) / (2 + 0 + 1) = 0.33
  d(jack, jim) = (1 + 1) / (1 + 1 + 1) = 0.67
  d(jim, mary) = (1 + 2) / (1 + 1 + 2) = 0.75
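A small Python sketch of the two coefficients (not from the original slides); Y/P values are coded as 1 and N as 0, and the symmetric Gender attribute is left out, as in the worked figures above:

def smc(i, j):
    """Simple Mismatch Coefficient: (f01 + f10) / (f00 + f01 + f10 + f11)."""
    return sum(a != b for a, b in zip(i, j)) / len(i)

def jaccard_dissimilarity(i, j):
    """Jaccard dissimilarity: (f01 + f10) / (f01 + f10 + f11)."""
    f01 = sum(a == 0 and b == 1 for a, b in zip(i, j))
    f10 = sum(a == 1 and b == 0 for a, b in zip(i, j))
    f11 = sum(a == 1 and b == 1 for a, b in zip(i, j))
    return (f01 + f10) / (f01 + f10 + f11)

jack = (1, 0, 1, 0, 0, 0)   # Fever, Cough, Test-1..Test-4
mary = (1, 0, 1, 0, 1, 0)
print(jaccard_dissimilarity(jack, mary))   # (0 + 1) / (2 + 0 + 1) = 0.33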

Sarajevo School of Science and Technology 13


Proximity:
Dissimilarity Between Data Objects
 Minkowski Distance Function for
Interval/Ratio Attributes
◼ Given two data objects i and j of p
interval/ratio attributes, the Minkowski
distance between the two objects is defined
as
  d(i, j) = (|xi1 − xj1|^q + |xi2 − xj2|^q + ... + |xip − xjp|^q)^(1/q)

Sarajevo School of Science and Technology 14


Proximity:
Dissimilarity Between Data Objects
 Minkowski Distance Function for
Interval/Ratio Attributes
◼ Special Cases:
 Manhattan distance (q = 1)
  d(i, j) = |xi1 − xj1| + |xi2 − xj2| + ... + |xip − xjp|
 Euclidean distance (q = 2)
  d(i, j) = sqrt(|xi1 − xj1|² + |xi2 − xj2|² + ... + |xip − xjp|²)
 Supremum (q = ∞)
  d(i, j) = max over t of |xit − xjt|

Sarajevo School of Science and Technology 15


Proximity:
Dissimilarity Between Data Objects
 Minkowski Distance Function for
Interval/Ratio Attributes
customerID No of Trans Revenue Tenure(Months)
101 30 3000 20
102 40 400 10
103 35 2000 30
104 20 1000 35
105 50 500 1
106 100 100 10
107 10 1000 2

  d1(cust101, cust102) = |30 − 40| + |3000 − 400| + |20 − 10| = 2620

  d2(cust101, cust102) = sqrt((30 − 40)² + (3000 − 400)² + (20 − 10)²) ≈ 2600

  dmax(cust101, cust102) = |3000 − 400| = 2600
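A small Python sketch of the three special cases, applied to customers 101 and 102 from the table above (illustration only):

def minkowski(x, y, q):
    """General Minkowski distance of order q."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1 / q)

def supremum(x, y):
    """Supremum distance (q = infinity)."""
    return max(abs(a - b) for a, b in zip(x, y))

cust101 = (30, 3000, 20)   # (No of Trans, Revenue, Tenure)
cust102 = (40, 400, 10)

print(minkowski(cust101, cust102, 1))   # Manhattan: 2620
print(minkowski(cust101, cust102, 2))   # Euclidean: ~2600.04
print(supremum(cust101, cust102))       # Supremum: 2600

Note how the Revenue attribute dominates all three distances, which is one motivation for the attribute scaling discussed later.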

Sarajevo School of Science and Technology 16


Proximity:
Dissimilarity Between Data Objects
 Cosine Similarity Among Attributes
◼ Treating two data objects as vectors
◼ Similarity is measured as the angle θ between two vectors
◼ Similarity is 1 when θ = 0, and 0 when θ = 90°
◼ Similarity function:

  cos(i, j) = (i • j) / (||i|| · ||j||),  where  i • j = Σ (k=1..n) ik · jk  and  ||i|| = sqrt(Σ (k=1..n) ik²)

◼ e.g. Given two vectors: x = (3, 2, 0, 5) and y = (1, 0, 0, 0),
  x • y = 3*1 + 2*0 + 0*0 + 5*0 = 3
  ||x|| = sqrt(3² + 2² + 0² + 5²) ≈ 6.16
  ||y|| = sqrt(1² + 0² + 0² + 0²) = 1
◼ Similarity: cos(x, y) = 3 / (6.16 * 1) = 0.49
  Dissimilarity: 1 – cos(x, y) = 0.51
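A small Python sketch of the cosine computation above (illustration only):

import math

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

x = (3, 2, 0, 5)
y = (1, 0, 0, 0)
print(cosine_similarity(x, y))        # ~0.49
print(1 - cosine_similarity(x, y))    # dissimilarity, ~0.51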

Sarajevo School of Science and Technology 17


Measures of Proximity
 Dealing with Objects of Different Attribute Types
◼ A hybrid measure based on the principle of ratio of
mismatch
 For the kth attribute, compute the dissimilarity dk(i, j) in
range [0,1]
 Set the indicator variable δk as follows:
◼ δk = 0, if the kth attribute is an asymmetric binary attribute and
both objects have value 0 for the attribute
◼ δk = 1, otherwise
 Compute the overall dissimilarity between i and j as:

  d(i, j) = [ Σ (k=1..n) δk · dk(i, j) ] / [ Σ (k=1..n) δk ]
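An illustrative sketch of this hybrid (Gower-style) combination; the per-attribute dissimilarity rules and the example attributes below are simplifying assumptions, not from the slides:

def mixed_dissimilarity(i, j, attribute_types):
    """Hybrid dissimilarity d(i, j) = sum(delta_k * d_k) / sum(delta_k)."""
    numerator, denominator = 0.0, 0.0
    for a, b, kind in zip(i, j, attribute_types):
        if kind == "asymmetric_binary" and a == 0 and b == 0:
            delta = 0                      # attribute ignored for this pair
        else:
            delta = 1
        if kind in ("nominal", "asymmetric_binary"):
            d_k = 0.0 if a == b else 1.0   # single-attribute mismatch
        else:                              # interval/ratio, assumed scaled to [0, 1]
            d_k = abs(a - b)
        numerator += delta * d_k
        denominator += delta
    return numerator / denominator if denominator else 0.0

# e.g. (gender, smoker flag, normalized age) -- hypothetical attributes
print(mixed_dissimilarity(("f", 0, 0.2), ("m", 0, 0.5),
                          ("nominal", "asymmetric_binary", "ratio")))   # (1 + 0.3) / 2 = 0.65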

Sarajevo School of Science and Technology 18


Measures of Proximity
 Attribute Scaling and Weighting in Proximity
◼ Attribute Scaling
 When scaling should be considered:
◼ on the same attribute when data from different data sources are merged
◼ on different attributes when data is projected into the N-space
 Normalizing variables into comparable ranges:
◼ divide each value by the mean: new = old/mean
◼ divide each value by the range: new = (old – min)/(max – min)
◼ z-score: new = (old – mean)/(absolute mean deviation)
◼ Attribute Weighting
 The weighted overall dissimilarity function:

  d(i, j) = [ Σ (k=1..n) wk · δk · dk(i, j) ] / [ Σ (k=1..n) δk ]
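A small sketch of the three normalization options for one numeric column (illustration only; the values are the "No of Trans" column from the earlier customer table):

import numpy as np

values = np.array([30.0, 40, 35, 20, 50, 100, 10])

by_mean  = values / values.mean()                                  # divide by the mean
by_range = (values - values.min()) / (values.max() - values.min()) # divide by the range
mad      = np.abs(values - values.mean()).mean()                   # absolute mean deviation
z_score  = (values - values.mean()) / mad                          # z-score variant above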

Sarajevo School of Science and Technology 19


Basic Clustering Methods
The K-means Method
 Overview
◼ A partition-based clustering method
◼ Only one version of clustering is stored
 Outline of Steps
◼ Choose the number of clusters to be formed (k)
◼ Choose the initial centroids for the k clusters (e.g. randomly). Initially,
each centroid is the only member of its cluster.
◼ Assign each data object to its nearest cluster
◼ Re-calculate the centroid of each cluster by taking the mean
vector of its members
◼ Undo the memberships and repeat steps 3 and 4 until no more
memberships change or a maximum number of iterations is reached
(a minimal sketch of these steps is shown below).
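A minimal K-means sketch, assuming numeric data given as a list of tuples (illustration only, not the slide's code; empty clusters and convergence details are simplified):

import random

def kmeans(data, k, max_iter=100):
    centroids = random.sample(data, k)                       # step 2: random initial centroids
    clusters = []
    for _ in range(max_iter):                                # step 5: iterate
        clusters = [[] for _ in range(k)]
        for x in data:                                       # step 3: assign to nearest centroid
            nearest = min(range(k), key=lambda c: sum((a - b) ** 2
                                                      for a, b in zip(x, centroids[c])))
            clusters[nearest].append(x)
        new_centroids = [tuple(sum(col) / len(cluster) for col in zip(*cluster))
                         if cluster else centroids[c]        # keep old centroid if cluster empties
                         for c, cluster in enumerate(clusters)]
        if new_centroids == centroids:                       # step 5: stop when centroids settle
            break
        centroids = new_centroids                            # step 4: recompute centroids
    return centroids, clusters

centroids, clusters = kmeans([(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (8.3, 7.9)], k=2)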
Sarajevo School of Science and Technology 20
Basic Clustering Methods
The K-means Method
 Explanations
◼ The value of K is chosen by the user. The best value for K is an issue
relating to the clustering quality.
◼ The initial K partition:
 Dividing the data set into K partitions randomly, and use data objects in each
partition to compute the first-round centroid for the partition
 Selecting K records as the initial centroids randomly, and start to assign data
objects to the clusters
 Using heuristics to anticipate the locations of the initial centroids
◼ The termination of the loop
 Checking whether any membership has changed (can be expensive)
 Checking whether the locations of all centroids have changed (may not always work)
 Using a minimum % of data objects that change their membership
 Setting a maximum number of iterations

Sarajevo School of Science and Technology 21


Basic Clustering Methods
The K-means Method
 Illustrated

Sarajevo School of Science and Technology 22


Basic Clustering Methods
The K-means Method
 Strengths & Weaknesses
◼ Strengths
 Simple and easy to implement
 Quite efficient
◼ Weaknesses
 Sensitive to the choice of initial k centroids
 Need to specify k beforehand
 Applicable only when mean is meaningfully defined
 Variants of the K-means Method
◼ K-medoid method, using the nearest data object to the virtual
centre as the centroid.
◼ K-mode method, calculating mode instead of mean for the
centre of the cluster.

Sarajevo School of Science and Technology 23


Basic Clustering Methods
The K-means Method
 Additional Issues
◼ How to improve the way that initial K centroids are chosen
 Finding the ideal K: run the clustering many times and select the
result with the best quality
 Finding the centroids: using hierarchical clustering to locate the centers, or
locating the centers that are farther apart
◼ Possibility of an empty cluster and incremental update of centroid
◼ Removing outliers before clustering?
◼ Improving cluster quality by post-processing
 Cluster split for low quality clusters
 Cluster merging for close-by clusters
 Dispersing a cluster without reducing the quality measure too much
◼ The Bisecting K-means method: start with a single loose cluster, and
repeatedly bisect one cluster into two smaller but better quality clusters
until K clusters are formed.

Sarajevo School of Science and Technology 24


Basic Clustering Methods
The Agglomeration Method
 Overview
◼ A hierarchical clustering method
◼ All versions of clustering are stored
 Outline of Steps
◼ Take all n data objects as individual clusters, and build a
dissimilarity matrix of size n x n. The matrix stores the distance
between any pair of data objects.
◼ While the number of clusters > 1 do:
 Find the pair of clusters (initially individual data objects) with the minimum distance
 Merge the two clusters into a bigger cluster
 Replace the entries in the matrix for the original clusters (objects) by
the centroid of the newly formed cluster
 Re-calculate relevant distances and update the matrix
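For illustration, the same idea can be run with SciPy's hierarchical clustering routines (assuming SciPy is available) instead of maintaining the dissimilarity matrix by hand:

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Sketch with assumed numeric data; "centroid" mirrors the centroid update above.
data = np.array([[1.0, 1.0], [1.5, 1.2], [8.0, 8.0], [8.5, 8.2], [9.0, 7.8]])

Z = linkage(data, method="centroid")              # full merge history (all versions of clustering)
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the hierarchy into 2 clusters
print(labels)                                     # e.g. [1 1 2 2 2]
# scipy.cluster.hierarchy.dendrogram(Z) can draw the merge tree shown on the next slide.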

Sarajevo School of Science and Technology 25


Basic Clustering Methods
The Agglomeration Method
 Illustrated

Sarajevo School of Science and Technology 26
Basic Clustering Methods
The Agglomeration Method
 Dendrogram

 Strength/Weakness
◼ Strengths: no need for k, multiple versions of clustering
◼ Weaknesses: not so efficient. Does not scale up well.
◼ Cannot undo what was done in the previous iteration (cf. the K-means
method)

Sarajevo School of Science and Technology 27


Basic Clustering Methods
The Agglomeration Method
 Similarity of Clusters
◼ The key operation of the Agglomeration method
is the calculation of similarity between clusters,
when deciding which to merge
◼ The different ways of calculating similarity
between clusters:
 Single link (minimum): the distance between two
closest points of the two clusters
 Complete link (maximum): the distance between two
farthest points of the two clusters
 Group average: the average of all pair-wise distances
 Centroids: the distance between the centroids of the
two clusters
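A small sketch of the four options, assuming Euclidean distance between numeric data objects (illustrative, not the slide's code):

from itertools import product

def dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def single_link(c1, c2):
    """Distance between the two closest points of the two clusters."""
    return min(dist(x, y) for x, y in product(c1, c2))

def complete_link(c1, c2):
    """Distance between the two farthest points of the two clusters."""
    return max(dist(x, y) for x, y in product(c1, c2))

def group_average(c1, c2):
    """Average of all pair-wise distances."""
    return sum(dist(x, y) for x, y in product(c1, c2)) / (len(c1) * len(c2))

def centroid_link(c1, c2):
    """Distance between the centroids of the two clusters."""
    centroid = lambda c: tuple(sum(col) / len(c) for col in zip(*c))
    return dist(centroid(c1), centroid(c2))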

Sarajevo School of Science and Technology 28


Cluster Evaluation & Validation
 The Issues
◼ Cluster Tendency (Do clusters really exist in the given
data set?)
◼ Cluster Quality (How to distinguish good and bad
clustering results?)
◼ Cluster interpretability (Do we understand what clusters
suggest?)
 Cluster Tendency
◼ Use cluster quality measures as an indication
◼ Compare data features of clusters against those of the
whole data set. If there are no significant differences, the
clusters may be superficial.
Sarajevo School of Science and Technology 29
Cluster Evaluation & Validation
 Cluster Tendency - Hopkins statistic

  H(P, S) = Σ (p, tp ∈ S) dist(p, tp) / [ Σ (m, tm ∈ P) dist(m, tm) + Σ (p, tp ∈ S) dist(p, tp) ]

where:
P: a set of n randomly generated data points
S: a sample of n data points from the data set
tp: the nearest neighbor of point p in S
tm: the nearest neighbor of point m in P
e.g. After a number of trials, average(HA) ≈ 0.56 and average(HB) ≈ 0.89:
there is no clustering tendency in figure A, but there is in figure B.
(figures A and B: two example data sets, A without visible clusters and B with visible clusters)
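A hedged sketch of one common formulation of the Hopkins statistic (nearest neighbours are taken within the original data set, so the details may differ slightly from the notation above; n is assumed to be much smaller than the data set size). Values near 0.5 suggest no clustering tendency and values near 1 suggest clusters, as in the example above:

import numpy as np

def hopkins(data, n=50, seed=0):
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    mins, maxs = data.min(axis=0), data.max(axis=0)
    P = rng.uniform(mins, maxs, size=(n, data.shape[1]))      # n random points in the data space
    S = data[rng.choice(len(data), size=n, replace=False)]    # n sampled data points

    def nn_dist(points, exclude_self):
        d = np.linalg.norm(points[:, None, :] - data[None, :, :], axis=2)
        if exclude_self:
            d[d == 0] = np.inf          # ignore a sampled point's zero distance to itself
        return d.min(axis=1)

    u = nn_dist(P, exclude_self=False)  # distances from random points to the data
    w = nn_dist(S, exclude_self=True)   # distances from sampled points to the rest of the data
    return u.sum() / (u.sum() + w.sum())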

Sarajevo School of Science and Technology 30


Cluster Evaluation & Validation
 Cluster Quality Measures
◼ High intra-cluster similarity
◼ Low inter-cluster similarity
◼ Cohesion measure: sum of squared error (WC)
◼ Separation measure: sum of distances between clusters (BC)
◼ The ratio BC/WC is a good measure of quality, combining the
cohesion measure and separation measure. The higher the ratio,
the better the clustering result.
 1  1
 d (r , r )
K K
1 1 BC =
    
2
WC = wc (C k ) =  d ( x , rk
2
)  j k
K K k =1  C k
K
k =1 xC k  1 j  k  K

where rk is the centroid of cluster k, and rj the centroid of


cluster j
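A sketch of these measures as reconstructed above, assuming each cluster is given as a NumPy array of numeric data objects (illustration only):

import numpy as np

def cluster_quality(clusters):
    """Return cohesion WC, separation BC and the BC/WC ratio."""
    K = len(clusters)
    centroids = [c.mean(axis=0) for c in clusters]
    wc = sum(np.sum((c - r) ** 2, axis=1).mean() for c, r in zip(clusters, centroids)) / K
    bc = sum(np.sum((centroids[j] - centroids[k]) ** 2)
             for j in range(K) for k in range(j + 1, K)) / K ** 2
    return wc, bc, bc / wc

clusters = [np.array([[1.0, 1.0], [1.2, 0.8]]), np.array([[8.0, 8.0], [8.4, 7.6]])]
wc, bc, ratio = cluster_quality(clusters)
print(round(ratio, 1))   # ~490 here; the higher BC/WC, the better the clustering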
Sarajevo School of Science and Technology 31
Cluster Evaluation & Validation
 How to Determine K for K-means Method?
◼ Add an outer loop for different values of K (from
low to high)
◼ At each iteration, conduct clustering using the
current K.
◼ Then use the cluster quality evaluation measure
to decide whether the resulting cluster quality is
acceptable
◼ If not, increase the value of K by 1 and move to
the next iteration
Sarajevo School of Science and Technology 32
Cluster Evaluation & Validation
 How to determine Which Level of Hierarchy
for Agglomeration Method?
◼ Start traversing the hierarchy level by level from
the root.
◼ At each level, evaluate the quality of clusters
using the cluster quality measure.
◼ If the quality is acceptable, then the current
level of clusters is taken as the final result. If
not, move to the next level and evaluate the
quality again until reaching the desirable level.
Sarajevo School of Science and Technology 33
Cluster Evaluation & Validation
 Use of Scree Plot
◼ Determining the right number of clusters is a
problem of optimization: the more clusters there
are, the better the quality of the clusters, but
the more difficult they are to interpret.
◼ SSE of clusters found
using K-means (©Tan et
al 2007)
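As an illustration (not from the slides), such an SSE-versus-K plot can be produced with scikit-learn and matplotlib (both assumed available); the data set here is a random placeholder:

import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

data = np.random.default_rng(0).normal(size=(300, 2))   # placeholder data set

ks = range(1, 11)
sse = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(data).inertia_ for k in ks]

plt.plot(ks, sse, marker="o")
plt.xlabel("Number of clusters K")
plt.ylabel("SSE (within-cluster sum of squared errors)")
plt.show()   # look for the 'elbow' where the SSE curve flattens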
Sarajevo School of Science and Technology 34
Cluster Interpretation
 Within Clusters
◼ Look into cluster features in terms of
attribute values
◼ Comparing attribute values within the
cluster against attribute values in the
data population
 Outside Clusters
◼ Majority of data objects within
clusters are considered as norm
whereas minority of objects outside
as exceptions
 Between Clusters
◼ Compare differences between
clusters
◼ Investigating possible associations
between objects from different
clusters

Sarajevo School of Science and Technology 35


Basic Clustering Methods
The Limitations
 Resulting clusters of convex shapes,
failing to detect clusters of arbitrary
shapes
 Resulting clusters of similar sizes, failing
to detect clusters of different sizes
 Distorted clusters for data with noise
and outliers (the agglomeration method
solves this problem to a certain extent)
 It is difficult to define mean for nominal
attributes.
 Similarity function does not consider
existing cluster shapes

Sarajevo School of Science and Technology 37


Advanced Clustering
The Overview
 Prototype based Methods (any data object is closer to the prototype that
defines its own cluster)
◼ Partition-based (e.g. K-means)
◼ Fuzzy clustering (e.g. Fuzzy c-means method (FCM))
◼ Mixture models (e.g. Gaussian Mixture Model (GMM))
◼ Relationship clustering (e.g. Self Organizing Map (SOM))
 Density based clustering (clusters are regions of high density)
◼ Grid based clustering (e.g. DBSCAN)
◼ Subspace clustering (e.g. CLIQUE)
◼ Transformation clustering (e.g. WaveCluster)
 Graph based clustering (data set is a graph of nodes and links with
weights)
◼ e.g. Minimum spanning tree (MST) clustering
◼ e.g. Optimal Partitioning (OPOSSUM)
◼ e.g. Chameleon

Sarajevo School of Science and Technology 38


Advanced Clustering
DBSCAN Method
 Overview
◼ A center-based density clustering method
◼ Density of a data object is estimated by counting the number of
points within a certain radius (Eps)
◼ According to a user-defined Eps, every data object (point) can
be categorized as either of the following:
 Core point (in the interior of a cluster): within a given Eps, the
number of neighboring points exceeds a given threshold MinPts
 Border point (on the border of a cluster): not a core point, but within
a given Eps of a core point, with fewer neighboring points than
required by the threshold MinPts
 Noise point (in a sparsely occupied region): a point
that is neither a core nor a border point, i.e.
outside the Eps of any core point

Sarajevo School of Science and Technology 39


Advanced Clustering
DBSCAN Method
 Algorithm steps
◼ Label all data points as core, border, or noise
◼ Eliminate noise points
◼ Assign an edge between all core points that are within Eps of each other
◼ Make each group of connected core points into a separate cluster
◼ Assign each border point to the cluster of its associated core point
 Setting Parameters
◼ Choose a fairly small value for k (e.g. k = 4)
◼ Calculate distances from each data
point to its k nearest neighbors
◼ Plot the sorted distance to the
kth neighbor of each data point and
select the value at which there is a
sharp increase as the Eps value
(© Tan et al 2007)
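A sketch of this procedure using scikit-learn (assumed available); the data set and the Eps value read off the plot are placeholders:

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt

data = np.random.default_rng(0).normal(size=(300, 2))     # placeholder data set
k = 4

# Distance from each point to its k-th nearest neighbour, sorted and plotted.
distances, _ = NearestNeighbors(n_neighbors=k + 1).fit(data).kneighbors(data)
kth = np.sort(distances[:, k])
plt.plot(kth)
plt.ylabel(f"distance to {k}th neighbour")
plt.show()

eps = 0.3                                                  # read off at the sharp increase
labels = DBSCAN(eps=eps, min_samples=k).fit_predict(data)  # label -1 marks noise points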

Sarajevo School of Science and Technology 40


Advanced Clustering
DBSCAN Method
 Illustrated

(figures: original data points; core, border and noise points; clustering results)

Sarajevo School of Science and Technology 41


Advanced Clustering
The Chameleon Method
 Overview
◼ A graph based clustering method using dynamic modeling
◼ Use two proximity measures to measure similarity
 Relative interconnectivity
 Relative closeness
 Rationales:
◼ Single-link agglomeration only considers closeness
◼ Group-average agglomeration only considers connectivity
◼ A similarity measure combining closeness and connectivity strikes the right balance
◼ The existing clusters should influence further clustering
 In the agglomeration step, the merged cluster is similar to
the two original clusters

Sarajevo School of Science and Technology 42


Advanced Clustering
The Chameleon Method
 Method Outline
◼ Build a k-nearest neighbor graph among
data objects
◼ Partition the graph into smaller size clusters
using a graph partitioning algorithm
◼ Repeat:
Merge the smaller clusters to bigger clusters
that preserve the cluster self-similarity with
respect to relative interconnectivity and
relative closeness
until no more clusters can be merged

Sarajevo School of Science and Technology 43


Advanced Clustering
The Chameleon Method
 Steps
◼ Building a k-nearest neighbor graph among data objects
 Nodes: data items
 Link between v and u: if v is among the k nearest neighbors of u,
or u is among the k nearest neighbors of v
 Weight of the link: the similarity between the two data items,
indicating the density of the data
(figure: 1-, 2- and 3-nearest neighbour graphs of the same data)
Sarajevo School of Science and Technology 44


Advanced Clustering
The Chameleon Method
 Steps
◼ Partitioning the k-nearest neighbor
graph
 Edge cut: the sum of weights of the
edges that straddle the partitions.
 Use a graph partitioning algorithm
named METIS
 During the partitioning, minimize the
edge cuts
 Rationale: links within a cluster should be
more numerous and stronger than links across
clusters; minimizing the edge cut minimizes
the relationships between partitions.

Sarajevo School of Science and Technology 45


Advanced Clustering
The Chameleon Method
 Steps
◼ Merging: Agglomeration step
 Conduct hierarchical clustering as in the agglomeration method
 At each iteration of agglomeration, use relative interconnectivity and relative
closeness to decide which two clusters to merge
  RI(Ci, Cj) = absolute interconnectivity between Ci and Cj / average of internal interconnectivities in Ci and Cj
             = EC(Ci, Cj) / [ (1/2) (EC(Ci) + EC(Cj)) ]

  where EC(Ci, Cj): edge cut between Ci and Cj
        EC(C): edge cut that partitions C into two roughly equal parts

  RC(Ci, Cj) = absolute closeness between Ci and Cj / average of internal closeness in Ci and Cj
             = AEC(Ci, Cj) / [ (|Ci| / (|Ci| + |Cj|)) AEC(Ci) + (|Cj| / (|Ci| + |Cj|)) AEC(Cj) ]

  where AEC(Ci, Cj): average of edge weights between Ci and Cj
        AEC(C): average of weights of the edges that partition C into two roughly equal parts
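A toy sketch of the two criteria with made-up edge-cut values (the EC and AEC numbers below are purely illustrative):

def relative_interconnectivity(ec_ij, ec_i, ec_j):
    """RI = EC(Ci, Cj) / (0.5 * (EC(Ci) + EC(Cj)))."""
    return ec_ij / (0.5 * (ec_i + ec_j))

def relative_closeness(aec_ij, aec_i, aec_j, size_i, size_j):
    """RC = AEC(Ci, Cj) / size-weighted average of AEC(Ci) and AEC(Cj)."""
    total = size_i + size_j
    return aec_ij / ((size_i / total) * aec_i + (size_j / total) * aec_j)

print(relative_interconnectivity(ec_ij=12.0, ec_i=20.0, ec_j=16.0))                 # ~0.67
print(relative_closeness(aec_ij=0.4, aec_i=0.5, aec_j=0.7, size_i=30, size_j=20))   # ~0.69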

Sarajevo School of Science and Technology 46


Advanced Clustering
The Chameleon Method
 Comparison

(figure: clustering results of DBSCAN, Chameleon and CURE compared)

Sarajevo School of Science and Technology 47


Clustering in Practice
 Steps of a Typical Clustering Project
◼ Locate data set and select clustering attributes (some algorithms
can determine which attributes to use automatically)
◼ Prepare data such as normalization of data values and weighting
of attributes
◼ If possible, choose a sensible similarity function
◼ Choose one or more clustering methods
◼ Conduct clustering with user-defined parameters
◼ Validate cluster tendency
◼ Evaluate cluster quality, if necessary, redo clustering with
different parameters
◼ Cluster interpretation and further analysis

Sarajevo School of Science and Technology 48


Clustering in Practice
 How to Choose Clustering Algorithms?
◼ Types of Clustering (taxonomy or partition, complete or incomplete)
◼ Types of Cluster (globular or non-globular)
◼ Characteristics of Clusters (sub-space, arbitrary shaped, spatial)
◼ Characteristics of the Data Set (is similarity measure sensible?)
◼ Presence of Noise and Outlier (with, without, as specific cluster)
◼ Size of the Data Set (scalable algorithm for large data set)
◼ Number of Attributes (scalability and meaning of clusters)
◼ Cluster Description and Its Size (centroids, parameters, members)
◼ Algorithmic Concerns
 Is the order among data important?
 Is the number of clusters determined automatically?
 Does algorithm optimization objective match application objective?

Sarajevo School of Science and Technology 49


Summary
 A clustering solution must include a sensible similarity measure function,
an efficient algorithm and a quality evaluation function
 Different similarity measures apply to different types of data
 K-means is a simple partition-based algorithm with limitations
 The agglomeration method is a hierarchical algorithm with poor efficiency
 Advanced algorithms include graph-based, density-based and model-
based methods that aim to overcome the limitations of simple methods
 A range of issues should be considered when performing clustering
 A sensible similarity measure and an optimal clustering outcome are the essential
factors for success
 Conducting repeated clustering using different clustering solutions and
parameters (if possible) is desirable in validating clustering results.

Sarajevo School of Science and Technology 50


Further Reading
 Hongbo Du, "Data Mining Techniques and
Applications", Chapters 4 & 5
 Tan, P., Steinbach, M. & Kumar, V.
“Introduction to Data Mining”, Chapter 8 &
9, Addison-Wesley, 2006
 Berry & Linoff, “Data Mining Techniques for
Marketing, Sales and Customer Support”,
Chapter 10, Wiley, 1997
Sarajevo School of Science and Technology 51
