

ISSN 8756-6990, Optoelectronics, Instrumentation and Data Processing, 2015, Vol. 51, No. 4, pp. 1–10. © Allerton Press, Inc., 2015.
Original Russian Text © I.A. Pestunov, S.A. Rylov, and V.B. Berikov, 2015, published in Avtometriya, 2015, Vol. 51, No. 4, pp. 12–22.

ANALYSIS AND SYNTHESIS OF SIGNALS AND IMAGES

Hierarchical Clustering Algorithms for Segmentation of Multispectral Images

I. A. Pestunov^a, S. A. Rylov^a, and V. B. Berikov^b


^a Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, pr. Lavrent'eva 6, Novosibirsk, 630090 Russia
^b Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, pr. Akademika Koptyuga 4, Novosibirsk, 630090 Russia
E-mail: pestunov@ict.nsc.ru
Received January 28, 2015

Abstract—Computationally efficient hierarchical clustering algorithms, HCA and HECA, for segmentation of multispectral images have been developed using the grid and ensemble approaches. A special metric is proposed that makes it possible to identify embedded clusters even when they overlap. The efficiency of the algorithms is confirmed by the results of experimental studies using model and real data.
Keywords: ensemble hierarchical clustering algorithm, grid approach, segmentation of multispectral satellite images.
DOI: 10.3103/S8756699015040020

INTRODUCTION

Segmentation is one of the most important steps in the analysis of digital images [1, 2]. It consists of
dividing an image into non-overlapping regions based on similarity of their spectral and/or spatial charac-
teristics (texture, size, shape, etc.). The most common approach to the segmentation of satellite images is
based on data clustering algorithms [3].
Clustering methods can be divided into two major groups: hierarchical and non-hierarchical. Non-hierarchical algorithms provide a fixed partition of the data, whereas hierarchical algorithms yield a system of embedded clusters corresponding to different hierarchical levels. Hierarchical representation is convenient in
interpreting results in the cases where information on the various levels of the cluster structure is required,
as well as in situations where the exact number of desired clusters is unknown. Traditional methods of
hierarchical clustering have some disadvantages. For example, the single linkage procedure is susceptible
to the so-called chain effect, and the complete and average linkage methods usually work well only with
spherical clusters. Furthermore, these methods do not allow separating overlapping clusters [4]. Another
serious drawback of these methods is their high computational complexity, which does not allow them to be
used for large data arrays such as multispectral images.
The ensemble approach has recently been widely used to improve the stability and performance of clustering [5–10]. However, methods based on hierarchical ensemble clustering have been the subject of only a few papers [11, 12]. Furthermore, the algorithms used in them are computationally expensive.
The objective of this study is to develop and investigate computationally efficient hierarchical clustering algorithms, HCA and HECA, for segmentation of multispectral satellite images. The algorithms can separate overlapping clusters, as well as clusters of complex shape and varying size and density. The work combines the grid and ensemble approaches to clustering and continues our previous studies [13–16].


HCA GRID HIERARCHICAL CLUSTERING ALGORITHM

The HCA hierarchical clustering algorithm considered here is based on the CCA algorithm [13], which combines the advantages of the grid [17] and density approaches: high computational efficiency and the ability to identify clusters with complex structure. To describe the proposed algorithm, we introduce several definitions.
Let the set of objects $X$ being classified consist of vectors lying in the feature space $\mathbb{R}^d$: $X = \{x_i = (x_i^1, \ldots, x_i^d) \in \mathbb{R}^d,\ i = 1, \ldots, N\}$. The vectors $x_i$ lie in a rectangular hyperparallelepiped $\Omega = [l^1, r^1] \times \ldots \times [l^d, r^d]$, where $l^j = \min_{x_i \in X} x_i^j$ and $r^j = \max_{x_i \in X} x_i^j$. By the grid structure we mean the partition of the feature space by the hyperplanes $x^j = (r^j - l^j)\,i/m + l^j$, $i = 0, \ldots, m$, where $m$ is the number of partitions of $\Omega$ in each dimension. The minimum element of this structure is a cell (a closed rectangular hyperparallelepiped bounded by hyperplanes). We introduce a common numbering of the cells (sequentially from one layer of cells to another). Cells $B_i$ and $B_j$ ($i \neq j$) are adjacent if their intersection is not empty. The set of cells adjacent to $B$ is denoted by $A_B$. The density $D_B$ of the cell $B$ is the ratio $D_B = N_B / V_B$, where $N_B$ is the number of elements of the set $X$ that fall in the cell $B$ and $V_B$ is the volume of the cell $B$. The cell $B$ is called nonempty if $D_B > 0$.
The nonempty cell $B_i$ is directly connected to the nonempty cell $B_j$ (denoted $B_i \to B_j$) if $B_j$ is the cell with the maximum number that satisfies the conditions $B_j = \arg\max_{B_k \in A_{B_i}} D_{B_k}$ and $D_{B_j} \geq D_{B_i}$. The nonempty adjacent cells $B_i$ and $B_j$ are directly connected ($B_i \leftrightarrow B_j$) if $B_i \to B_j$ or $B_j \to B_i$. The nonempty cells $B_i$ and $B_j$ are connected ($B_i \sim B_j$) if there exist $k_1, \ldots, k_l$ such that $k_1 = i$, $k_l = j$, and $B_{k_p} \leftrightarrow B_{k_{p+1}}$ for all $p = 1, \ldots, l - 1$.
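
To make these definitions concrete, the following sketch (hypothetical Python/NumPy code; the names grid_densities and direct_link are ours, not from the paper) assigns points to grid cells, computes the cell densities $D_B = N_B / V_B$, and finds the cell to which a given nonempty cell is directly connected.

    import numpy as np
    from itertools import product
    from collections import Counter

    def grid_densities(X, m):
        """Map each point to its cell (a tuple of indices); return cell densities D_B = N_B / V_B."""
        X = np.asarray(X, dtype=float)
        l, r = X.min(axis=0), X.max(axis=0)
        side = (r - l) / m
        side[side == 0] = 1.0                                   # guard against constant features
        idx = np.minimum(((X - l) / side).astype(int), m - 1)   # right-border points go to the last cell
        counts = Counter(map(tuple, idx))
        volume = float(np.prod(side))                           # all cells have the same volume
        return {cell: n / volume for cell, n in counts.items()}

    def direct_link(cell, density):
        """Adjacent cell that `cell` is directly connected to (B_i -> B_j), or None.
        Adjacency = nonempty intersection, i.e. all 3^d - 1 neighbors differing by -1/0/+1 per axis.
        Tie-breaking by maximum cell number is omitted for brevity."""
        best = None
        for off in product((-1, 0, 1), repeat=len(cell)):
            if any(off):                                        # skip the zero offset (the cell itself)
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in density and (best is None or density[nb] > density[best]):
                    best = nb
        return best if best is not None and density[best] >= density[cell] else None

Connectedness components can then be obtained by following these links, e.g., with a union-find structure over the nonempty cells.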
The introduction of the connectedness relation leads to a natural partition of the set of nonempty cells into connectedness components $\{G_1, \ldots, G_S\}$. By a connectedness component we mean a maximal set of pairwise connected cells. The cell $Y(G)$ with the maximum number satisfying the condition $Y(G) = \arg\max_{B \in G} D_B$ is called the representative of the connectedness component $G$.
In the proposed HCA algorithm, the elements of the hierarchy are the connectedness components introduced above, rather than the elements of the initial data. Owing to the use of the grid structure, the number of resulting connectedness components is relatively small, so constructing the hierarchy does not require large computational effort.
To construct a hierarchical grid clustering algorithm, we introduce a metric between the connectedness components. The components $G_i$ and $G_j$ are considered adjacent if there are adjacent cells $B_i$ and $B_j$ such that $B_i \in G_i$ and $B_j \in G_j$. We define the distance between the adjacent connectedness components $G_i$ and $G_j$ by the formula

$$h_{ij} = \min_{P_{ij} \in \Pi_{ij}} \Bigl[ 1 - \min_{B_{k_t} \in P_{ij}} D_{B_{k_t}} \big/ \min(D_{Y_i}, D_{Y_j}) \Bigr].$$

Here $\Pi_{ij} = \{P_{ij}\}$ is the set of all chains $P_{ij} = \langle Y_i = B_{k_1}, \ldots, B_{k_t}, B_{k_{t+1}}, \ldots, B_{k_l} = Y_j \rangle$ between the representatives of the connectedness components such that for all $t = 1, \ldots, l - 1$: (1) $B_{k_t} \in G_i \cup G_j$; (2) $B_{k_t}$ and $B_{k_{t+1}}$ are adjacent cells.
We construct the matrix of distances between arbitrary connectedness components from the distances $\{h_{ij}\}$ between adjacent components as follows. Let $\Theta(G_i, G_j) = \Theta_{ij} = \{Q_{ij}\}$ be the set of all chains of connectedness components $Q_{ij} = \langle G_i = G_{k_1}, \ldots, G_{k_t}, G_{k_{t+1}}, \ldots, G_{k_l} = G_j \rangle$ such that for all $t = 1, \ldots, l - 1$ the components $G_{k_t}$ and $G_{k_{t+1}}$ are adjacent. Then the distance between arbitrary connectedness components $G_i$ and $G_j$ can be calculated by the formula

$$\tilde h_{ij} = \min_{Q_{ij} \in \Theta_{ij}} \Bigl[ \max_t h_{k_t, k_{t+1}} \Bigr].$$

If the set $\Theta_{ij}$ is empty, we set $\tilde h_{ij} = 1$.


For a fixed chain of connectedness components $Q_{ij} = \langle G_i = G_{k_1}, \ldots, G_{k_t}, G_{k_{t+1}}, \ldots, G_{k_l} = G_j \rangle$, we define its length as $d(Q_{ij}) = \max_t h_{k_t, k_{t+1}}$. Then the distance between the connectedness components $G_i$ and $G_j$ can be rewritten as

$$\tilde h_{ij} = \min_{Q_{ij} \in \Theta_{ij}} d(Q_{ij}).$$


The introduced distance, which is based on an estimate of the data distribution density, eliminates the problem of overlapping classes inherent in traditional hierarchical methods [18].
It is easy to show that the introduced quantity $\tilde h_{ij}$ is an ultrametric [19], i.e., a metric that satisfies the strong triangle inequality $\tilde h_{ij} \leq \max(\tilde h_{ik}, \tilde h_{kj})$ for all $i, j, k$. There is a one-to-one correspondence between distance matrices having the ultrametric property and dendrograms [11], so such matrices can be used as descriptors of hierarchical clustering results.
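
For instance, the strong triangle inequality can be verified numerically with a minimal sketch (hypothetical code, assuming NumPy; H is any symmetric distance matrix with zero diagonal):

    import numpy as np

    def is_ultrametric(H, tol=1e-12):
        """Check the strong triangle inequality H[i, j] <= max(H[i, k], H[k, j]) for all i, j, k."""
        for k in range(H.shape[0]):
            # max(H[i, k], H[k, j]) over all pairs (i, j) with intermediate point k
            if (H > np.maximum.outer(H[:, k], H[k, :]) + tol).any():
                return False
        return True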
Calculation of the ultrametric $\{\tilde h_{ij}\}$ from the distance matrix $\{h_{ij}\}$ is the minimum transitive closure operation [12]. Its implementation often involves computationally expensive algorithms. For example, Mirzaei and Rahmati [11] employ the matrix multiplication method, whose computational complexity is $O(n^4)$ (for an $n \times n$ matrix). Zheng et al. [12] use a modification of the Floyd–Warshall algorithm with computational complexity $O(n^3)$. Skiena [20] proposes a recursive depth-first (or breadth-first) traversal of all chains starting from each element; this approach works well for sparse graphs, but for dense graphs its complexity reaches $O(n^3)$.
In this work, the minimum transitive closure operation is performed using the single linkage (SLINK) algorithm for constructing dendrograms [21], which has complexity $O(n^2)$. The following proposition shows that applying the single linkage method to the distance matrix $\{h_{ij}\}$ yields a dendrogram equivalent to the desired ultrametric.
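
As an illustration of why $O(n^2)$ suffices: the single-linkage ultrametric equals the minimax path distance in the graph of components, which a Prim-style pass over a minimum spanning tree computes directly. The sketch below is our own illustration of this idea (hypothetical NumPy code, not the SLINK implementation itself; non-adjacent pairs in h carry the distance 1, as in the definition above):

    import numpy as np

    def minimax_closure(h):
        """Ultrametric closure of a symmetric distance matrix h (zero diagonal).
        h_tilde[i, j] is the smallest possible largest edge over all chains from
        i to j, i.e. the single-linkage dendrogram distance. O(n^2) via Prim's MST."""
        n = h.shape[0]
        parent = np.zeros(n, dtype=int)      # tree edge by which each node is attached
        in_tree = np.zeros(n, dtype=bool)
        in_tree[0] = True
        best = h[0].copy()                   # cheapest known edge from each node to the tree
        h_tilde = np.zeros((n, n))
        for _ in range(n - 1):
            j = int(np.argmin(np.where(in_tree, np.inf, best)))
            # j joins as a leaf: its minimax distance to any tree node i is the max of
            # the attaching edge and the already-known distance from parent[j] to i
            h_tilde[j, in_tree] = np.maximum(best[j], h_tilde[parent[j], in_tree])
            h_tilde[in_tree, j] = h_tilde[j, in_tree]
            in_tree[j] = True
            closer = ~in_tree & (h[j] < best)
            parent[closer] = j
            best = np.minimum(best, h[j])
        return h_tilde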
Proposition. The dendrogram constructed from the ultrametric $\{\tilde h_{ij}\}$ coincides with the dendrogram formed from the distance matrix $\{h_{ij}\}$ by the single linkage method.
Proof. We show that each level of the two dendrograms yields the same equivalence relation (belonging to the same cluster), $\tilde R_\alpha$ and $R_\alpha$, respectively.
Choose a value $\alpha \in [0, 1]$. In accordance with the single linkage method, the elements $i$ and $j$ belong to the same cluster if and only if there exists a chain $Q_{ij} \in \Theta_{ij}$ in which the lengths of all the segments do not exceed $\alpha$, i.e., $d(Q_{ij}) \leq \alpha$.
Suppose that the dendrograms are constructed from the matrices $\{h_{ij}\}$ and $\{\tilde h_{ij}\}$, and let $R_\alpha$ and $\tilde R_\alpha$ be the equivalence relations obtained by cutting these dendrograms at the levels corresponding to the threshold value $\alpha$.
Two cases are possible.
1. Let $i$ and $j$ belong to the same cluster according to the relation $R_\alpha$:

$$i R_\alpha j \;\Leftrightarrow\; \exists\, Q_{ij} \in \Theta_{ij}:\ d(Q_{ij}) \leq \alpha \;\Rightarrow\; \tilde h_{ij} \leq \alpha \;\Rightarrow\; i \tilde R_\alpha j.$$

Thus, $i$ and $j$ belong to the same cluster by the relation $\tilde R_\alpha$.

2. Let $i$ and $j$ belong to different clusters by the relation $R_\alpha$:

$$\neg(i R_\alpha j) \;\Leftrightarrow\; \forall\, Q_{ij} \in \Theta_{ij}:\ d(Q_{ij}) > \alpha \;\Rightarrow\; \tilde h_{ij} = \min_{Q_{ij} \in \Theta_{ij}} d(Q_{ij}) > \alpha \;\Rightarrow\; \forall\, Q_{ij} \in \Theta_{ij}:\ \tilde d(Q_{ij}) \geq \tilde h_{ij} > \alpha \;\Rightarrow\; \neg(i \tilde R_\alpha j),$$

where $\tilde d(Q_{ij})$ is the length of the chain measured in the distances $\{\tilde h_{ij}\}$. Thus, $i$ and $j$ belong to different clusters by the relation $\tilde R_\alpha$.


Consequently, the relations $\tilde R_\alpha$ and $R_\alpha$ coincide for each value of $\alpha$, i.e., the constructed dendrograms coincide.
The proposed hierarchical grid algorithm HCA(m) can be written as a sequence of major steps.
Step 1. Formation of the cell structure. For each point $x_i \in X$, the cell containing it is determined, and the densities $D_B$ of all cells are calculated.
Step 2. Identification of the connectedness components $\{G_1, \ldots, G_S\}$ and their representatives $Y(G_1), \ldots, Y(G_S)$.
Step 3. Calculation of the distance matrix $\{h_{ij}\}$ between adjacent connectedness components in accordance with the above definition.
Step 4. Construction of a dendrogram from the distance matrix $\{h_{ij}\}$ by the single linkage method.
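
In outline, the four steps map onto code as follows (a hypothetical sketch reusing the fragments above; connected_components and component_distances are assumed helpers standing in for Steps 2 and 3, whose chain-based details are given earlier in the text; the final step uses SciPy's single-linkage routine):

    from scipy.cluster.hierarchy import linkage
    from scipy.spatial.distance import squareform

    def hca(X, m):
        density = grid_densities(X, m)                 # Step 1: cells and densities
        components = connected_components(density)     # Step 2: follow direct links (assumed helper)
        h = component_distances(components, density)   # Step 3: chain-based h_ij (assumed helper)
        # Step 4: dendrogram over the components by single linkage
        return linkage(squareform(h, checks=False), method='single')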
The HCA algorithm has low computational cost, can identify complex multimode clusters, and provides information on the hierarchical structure of the data. Figure 1 shows the Bananas model (model 1), which consists of 400 two-dimensional points grouped into two linearly inseparable classes; the model was constructed using the PRTools package [22] with a parameter of 0.7. However, experimental studies have shown that the results of the proposed algorithm depend strongly on the parameter m, which determines the scale of the elements of the grid structure. Figure 2 shows the dependence of the clustering accuracy on the grid parameter m when the HCA algorithm is applied to the Bananas model.



Fig. 1. Bananas model (left) and the result of reference clustering (right).


Fig. 2. Clustering accuracy versus the initial grid parameter m on the Bananas model for the HCA algorithm (one grid) and for the HECA ensemble algorithm with different numbers of grids (2, 4, 8).

HECA HIERARCHICAL ENSEMBLE CLUSTERING ALGORITHM

In recent years, ensemble clustering methods have been actively used to improve the stability of the
results [8]. The ensemble approach provides a significant improvement in the results of the HCA algorithm.
The instability of the results of the HCA algorithm under a change in the grid scale is due to the fact that with a fine grid there are a large number of connectedness components, the effect of noise increases, and the connections between adjacent components are lost. On the other hand, with a coarse grid, problems arise in separating overlapping classes (Fig. 3). Therefore, the stability of the results can be significantly improved by combining information from different scales using the ensemble method.
In this paper, we propose the HECA hierarchical ensemble clustering algorithm, in which the elements of the ensemble are the results of the HCA algorithm for fixed values of the grid parameter m. A collective solution is constructed using the consistent distance matrix obtained by averaging the distance matrices over all elements of the ensemble. The theory underlying this method of constructing collective solutions is given in [15].
Thus, to generate an ensemble, the grid HCA algorithm proposed above is run $L$ times for different values of the grid parameter $m$. The result is $L$ matrices of distances between the connectedness components, $\{\tilde h_{ij}^{(1)}\}, \ldots, \{\tilde h_{ij}^{(L)}\}$. Note that at this step the matrices must already have the ultrametric property, i.e., the transitive closure operation must have been performed.



Fig. 3. Example of the effect of the grid size on the separability of overlapping clusters: (a) the distribution density of a random variable; (b)–(d) histograms at different sampling levels.


Fig. 4. Clustering accuracy versus the grid parameter m on the Bananas model for dendrogram construction by the single linkage method (curve 1) and the average linkage method (curve 2) in the HECA ensemble algorithm using eight grids.



Fig. 5. Experiment with the model consisting of normally distributed classes: (a) initial data;
(b)–(d) clustering results using the proposed ensemble algorithm for values of the dendrogram
aggregation parameter of 0.4, 0.6, and 0.95, respectively.

The consistent distance matrix $\{H_{ij}\}$ (whose size coincides with the size of the matrix $\{\tilde h_{ij}^{(L)}\}$ on the finest grid) is constructed as follows:

$$H_{ij} = \frac{1}{L} \sum_{k=1}^{L} \tilde h^{(k)}\bigl(G_i^{(k)}, G_j^{(k)}\bigr).$$

Here $G_i^{(k)}$ is the component on the grid with number $k$ that contains the representative cell of the component $G_i$ from the grid with number $L$. Thus, the problem of matching the cluster labels within the ensemble is solved using the representatives of the connectedness components.
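
Under these definitions, the averaging step admits a compact sketch (hypothetical code, assuming NumPy; ultras[k] is the ultrametric matrix of grid k, and labels[k][i] is assumed to give the index of the component on grid k that contains the representative cell of the finest-grid component i):

    import numpy as np

    def consistent_matrix(ultras, labels):
        """Average the L ultrametric matrices, matching components across grids
        through the representatives of the finest-grid components."""
        L, n = len(ultras), len(labels[-1])
        H = np.zeros((n, n))
        for k in range(L):
            lab = np.asarray(labels[k])
            H += ultras[k][np.ix_(lab, lab)]  # h~^(k)(G_i^(k), G_j^(k)) for every pair (i, j)
        return H / L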
The resulting consistent distance matrix $\{H_{ij}\}$ may no longer have the ultrametric property. Unfortunately, the search for the average ultrametric (simultaneously closest to all the original ultrametrics) is an NP-hard problem [11]. Therefore, the final hierarchical result is obtained by applying a dendrogram construction method to the consistent distance matrix $\{H_{ij}\}$. Vichi [23] notes that applying the unweighted pair group method with arithmetic mean (UPGMA) to the arithmetic mean of the ultrametric matrices often gives a result that coincides with the results of time-consuming optimization methods of searching for the closest ultrametric. Our experimental studies have shown that the UPGMA method is superior to the SLINK method in the proposed algorithm. Figure 4 presents the clustering accuracy versus the grid parameter m when the average linkage method and the single linkage method are used in the last step of the HECA algorithm.
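
In SciPy terms, the final step might look as follows (a sketch, not the authors' implementation; H is the consistent distance matrix and 0.6 an example aggregation level, as in Fig. 5c):

    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    # UPGMA (average linkage); H need not be an ultrametric, so this is a heuristic
    # stand-in for the NP-hard search for the closest average ultrametric
    Z = linkage(squareform(H, checks=False), method='average')
    labels = fcluster(Z, t=0.6, criterion='distance')  # cut the dendrogram at level 0.6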
The experimental studies confirm that the use of the ensemble approach significantly improves the clustering results and their stability against changes in the grid parameter. The graph in Fig. 2 demonstrates the improvement in clustering accuracy with an increasing number of grids (for an ensemble of $L$ grids, the grid parameters were taken from the set $\{m, m + 2, \ldots, m + 2(L - 1)\}$).

EXPERIMENTAL RESULTS

Numerous experiments on simulated data have demonstrated the ability of the algorithm to identify complex structures in the data. In all experiments, an ensemble of eight elements was used.



Fig. 6. Experiment with a model consisting of eight classes which differ in shape, size, and density: (a) initial data; (b)–(d) the results of clustering by the HECA, OPTICS/DeLiClu, and SLINK algorithms, respectively.

Comparison of the running time of the clustering algorithms (optimal parameter settings were used; time in seconds)

Algorithm          Model 1        Model 2         Model 3         ALOS satellite image   RGB image
                   (400 points,   (4000 points,   (9388 points,   (50,000 pixels)        (265,776 pixels)
                   see Fig. 1)    see Fig. 5)     see Fig. 6)
HECA               0.005          0.003           0.055           0.07                   0.03
k-means (Lloyd)    0.03           0.08            0.06            0.6                    1.0
EM                 0.65           18.4            1.45            43                     78
DBSCAN             0.06           0.35            1.3             25                     136
OPTICS             0.03           0.4             1.8             55                     390
DeLiClu            0.19           0.73            2.0             35                     430
SLINK              0.01           0.42            2.2             96                     2752

The performance of the HECA algorithm was compared with that of other algorithms using the ELKI open-source software package [24], which includes well-known clustering algorithms such as k-means, EM, DBSCAN, OPTICS, DeLiClu, and SLINK.
Figure 5a shows model 2, which consists of eight normally distributed classes. The classes are united into three isolated groups, one of which contains substantially overlapping classes, which significantly complicates their separation. The HECA algorithm not only effectively separates all eight classes (Fig. 5b) but also identifies the hierarchical structure of the data (Figs. 5b–d).
Algorithms such as k-means and EM, which are designed for separating normally distributed classes, cope with this model only when the centers are successfully initialized. At the same time, density methods (DBSCAN, OPTICS, and DeLiClu) and conventional hierarchical methods (single linkage, complete linkage, weighted average linkage) are not capable of separating the strongly overlapping group of clusters; they yield six or fewer clusters, depending on the values of the selected parameters (see Figs. 5c, d).



Fig. 7. Experiment with an RGB image: (a) original image; (b)–(d) results of clustering by the
HECA algorithm at different levels of hierarchy.


Fig. 8. Experiment with a WorldView-2 satellite image: (a) RGB-composite (channels 5, 3, and
2) of the original image; (b)–(d) results of clustering by the hierarchical ensemble algorithm at
different levels of hierarchy.

The examined model 3, shown in Fig. 6a, consists of eight clusters that differ in shape, size, and density, including circular, spiral, and normally distributed classes [25]. When fine grids are used (m ≥ 60), the HECA ensemble algorithm successfully identifies all the clusters, and its accuracy reaches 99.3% (Fig. 6b). In contrast, none of the algorithms from the ELKI package could accurately identify all eight clusters. Among the hierarchical algorithms, the SLINK method showed the best results for this model: it identified the spiral-shaped classes but had difficulties in separating the normally distributed classes (Fig. 6d). The best result (94.6%) was achieved by the OPTICS and DeLiClu algorithms; however, they were unable to identify the spiral-shaped clusters (Fig. 6c).
An important advantage of the proposed HECA algorithm is its high computational efficiency. The table shows a comparison of the running times of the HECA algorithm and the algorithms of the ELKI software package [24] for the above models and real images (a color RGB image of 452 × 588 pixels and a fragment of a four-channel image of 250 × 200 pixels obtained from the ALOS satellite). Although this package does not guarantee the maximum speed of the algorithms included in it, it is the most efficient of the available tools, in particular, owing to the possibility of using an R*-tree for indexing. The proposed HECA algorithm and the ELKI package were implemented in the Java programming language. A four-core Intel Core i7 PC with a clock frequency of 3.2 GHz and 8 GB of RAM was used.
Figure 7 shows an example of processing of a color image of 452 × 588 pixels; the processing time was 0.03 s. An example of processing of a WorldView-2 satellite image using the HECA algorithm is presented in Fig. 8. The clustering was performed using four channels: 2, 3, 5, and 7. The image size is 600 × 1552 pixels, and the processing time was 0.7 s.

CONCLUSIONS

HCA and HECA hierarchical clustering algorithms for segmentation of satellite images have been proposed. The results of experiments using model and real data confirm the high quality of the solutions obtained and their stability against changes in the configurable parameters. The possibility of constructing a hierarchical system of embedded clusters significantly facilitates the interpretation of results. The high performance of the algorithms allows multispectral images to be processed interactively.
It should be noted that the use of the grid structure imposes a restriction on the dimension of the data being processed: the algorithms work effectively with multispectral images containing up to eight spectral bands.
In the future, it is planned to adapt the algorithms for graphics processors and to overcome the limitations caused by the use of the grid structure.
This work was supported by the Russian Foundation for Basic Research (Grant Nos. 14-07-31320 and 14-07-00249) and the Russian Science Foundation (Grant No. 14-14-00453).

REFERENCES

1. R. C. Gonzalez and R. E. Woods, Digital Image Processing (Tekhnosphera, Moscow, 2006) [Russian translation].
2. P. A. Chochia, "Image Segmentation Based on the Analysis of Distances in an Attribute Space," Avtometriya 50 (6), 97–110 (2014) [Optoelectron. Instrum. Data Process. 50 (6), 613–624 (2014)].
3. I. A. Pestunov and Yu. N. Sinyavskii, “Clustering Algorithms in Problems of Segmentation of Satellite Images,”
Vestn. KemGU 2 (4(52)), 110–125 (2012).
4. R. Xu and D. Wunsch II, "Survey of Clustering Algorithms," IEEE Trans. Neural Networks 16 (3), 645–678 (2005).
5. A. K. Jain, "Data Clustering: 50 Years Beyond K-Means," Patt. Recogn. Lett. 31 (8), 651–666 (2010).
6. R. Ghaemi, M. Sulaiman, H. Ibrahim, and N. Mustapha, “A Survey: Clustering Ensembles Techniques,” World
Acad. Sci., Eng. and Technol. 3 (2), 535–544 (2009).
7. P. Hore, L. Hall, and D. Goldgof, "A Scalable Framework for Cluster Ensembles," Patt. Recogn. 42 (5), 676–688 (2009).
8. R. Kashef and M. Kamel, “Cooperative Clustering,” Patt. Recogn. 43 (7), 2315–2329 (2010).
9. J. Jia, B. Liu, and L. Jiao, "Soft Spectral Clustering Ensemble Applied to Image Segmentation," Front. Comput. Sci. China 5 (1), 66–78 (2011).
10. L. Franek and X. Jiang, “Ensemble Clustering by Means of Clustering Embedding in Vectorspaces,” Patt. Recogn.
47 (2), 833–842 (2014).
11. A. Mirzaei and M. Rahmati, “A Novel Hierarchical-Clustering-Combination Scheme Based on Fuzzy-Similarity
Relations,” IEEE Trans. Fuzzy Syst. 18 (1), 27–39 (2010).
12. L. Zheng, T. Li, and C. Ding, "Hierarchical Ensemble Clustering," in Proc. 2010 IEEE Intern. Conf. on Data Mining (IEEE, 2010), pp. 1199–1204.
13. E. A. Kulikova, I. A. Pestunov, and Yu. N. Sinyavskii, "Nonparametric Clustering Algorithm for Processing Large Data Arrays," in Proc. 14th All-Russian Conf. on Mathematical Methods of Pattern Recognition (MAKS Press, Moscow, 2009), pp. 149–152.
14. I. A. Pestunov, V. B. Berikov, and Yu. N. Sinyavskii, “Segmentation of Multispectral Images Based on an
Ensemble of Nonparametric Clustering Algorithms,” Vestn. SibGAU, No. 5(31), 56–64 (2010).
15. I. A. Pestunov, V. B. Berikov, E. A. Kulikova, and S. A. Rylov, "Ensemble Clustering Algorithm for Large Datasets," Avtometriya 47 (3), 49–58 (2011) [Optoelectron. Instrum. Data Process. 47 (3), 245–252 (2011)].
16. I. A. Pestunov and S. A. Rylov, “Algorithms of Spectral Texture Segmentation of Satellite Images of High Spatial
Resolution,” Vestn. KemGU 2 (4(52)), 104–110 (2012).
17. M. R. Ilango and V. Mohan, “A Survey of Grid Based Clustering Algorithms,” Intern. Journ. Eng. Sci. Technol.
2 (8), 3441–3446 (2010).


18. Y. Lu and Y. Wan, "PHA: A Fast Potential-Based Hierarchical Agglomerative Clustering Method," Patt. Recogn. 46 (5), 1227–1239 (2013).
19. B. Leclerc, "Description Combinatoire des Ultramétriques," Math. Sci. Humaines, No. 73, 5–37 (1981).
20. S. S. Skiena, The Algorithm Design Manual (Springer, 2008).
21. C. F. Olson, "Parallel Algorithms for Hierarchical Clustering," Parallel Comput. 21 (8), 1313–1325 (1995).
22. PRTools: The Matlab Toolbox for Pattern Recognition. URL: http://www.prtools.org (accessed January 28, 2015).
23. M. Vichi, "Principal Classifications Analysis: A Method for Generating Consensus Dendrograms and Its Application to Three-Way Data," Comput. Stat. Data Anal. 27 (3), 311–331 (1998).
24. E. Achtert, H.-P. Kriegel, E. Schubert, and A. Zimek, "Interactive Data Mining with 3D-Parallel-Coordinate-Trees," in Proc. ACM Intern. Conf. on Management of Data (SIGMOD), New York, 2013, pp. 1009–1012.
25. S. A. Rylov, Model of Two-Dimensional Data for Clustering. URL: https://cloud.mail.ru/public/c5f33ae275a8/TestData Rylov 2D Labelled 2472 elements.txt (accessed January 23, 2015).
