Full Pattern Cluster Poster PDF
Introduction:
Modern X-ray diffraction equipment, such as X'Pert PRO systems with an X'Celerator detector, can collect hundreds of scans in a very short time. This is useful in
application areas such as polymorph screening, high-throughput screening and non-ambient experiments.
Here we present a method that greatly simplifies the analysis of large amounts of data by automatically sorting all scans of an experiment into closely related clusters and identifying
both the most representative scan of each cluster and any outlying patterns.
Full-pattern cluster analysis is a new feature added to the FDA Part 11 compliant X-ray powder diffraction analysis software packages X'Pert HighScore/HighScore Plus.
This software comprises several full-pattern analysis methods, such as search/match phase identification, quantitative Rietveld analysis and crystallinity determination, plus an exhaustive
range of pattern treatment methods and complete report generation in RTF format or as MS Word documents. All methods can be used in an automated way (pushbutton or
command line), in any sequence and with user-definable parameters.
The implemented cluster analysis method is essentially an automatic 3-step process. Additional visualization tools (dendrograms, histograms and score plots derived from
principal component analysis) are available to judge and influence the clustering.
First, all measured data sets (which could also include peaks information) are reduced to probability curves ui(x). The match is carried out by a direct comparison between ui(x) [...] (FOM's) are calculated to indicate the similarity between the data sets. All indicators are calculated for the overlapping range of the compared data sets and are [...]

[...] as a dendrogram (figure 2). Each pattern starts at the left side as a separate cluster, and these clusters [...] vertical tie bars. The horizontal position of the tie bar represents a dissimilarity measure.

[...] spectra. The minimum KGS value represents the state where the clusters are as highly populated as possible, whilst simultaneously maintaining the smallest spread. The most representative data set is defined as the data set that has the minimum mean distance from all other data sets in a given cluster.

[Figure: Penalty Function (axis: Penalty Value; points labeled with Cut-Off values from 6.22 to 194.28)]
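The clustering and representative-selection steps described in this section can be illustrated with a minimal sketch. It is not the HighScore implementation. Several things here are assumptions made only for illustration: the dissimilarity is taken as 1 minus the Pearson correlation of two patterns already sampled on the same grid (standing in for the figures of merit, which are not specified in the text), the clusters are merged by average linkage, and the cut-off is supplied by hand rather than derived from the minimum KGS penalty. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def dissimilarity(a, b):
    """1 - Pearson correlation of two equally sampled patterns.

    Assumed stand-in for the figures of merit; fails on a pattern
    with constant intensity (zero variance).
    """
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / (var_a * var_b) ** 0.5


def distance_matrix(patterns):
    """Symmetric matrix of pairwise dissimilarities."""
    n = len(patterns)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dissimilarity(patterns[i], patterns[j])
    return d


def linkage(c1, c2, d):
    """Average-linkage distance between two clusters of indices."""
    return mean(d[i][j] for i in c1 for j in c2)


def cluster(d, cut_off):
    """Agglomerative clustering: every pattern starts as its own
    cluster; the closest pair is merged until the next merge would
    exceed cut_off."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]], d),
        )
        if linkage(clusters[a], clusters[b], d) > cut_off:
            break
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index b stays valid
        del clusters[b]
    return clusters


def most_representative(members, d):
    """Member with the minimum mean distance to all other members."""
    if len(members) == 1:
        return members[0]
    return min(members, key=lambda i: mean(d[i][j] for j in members if j != i))


# Toy intensities: two pairs of similar patterns and one outlier.
patterns = [
    [1, 5, 1, 1, 1],
    [1, 6, 1, 1, 2],
    [1, 1, 1, 5, 1],
    [2, 1, 1, 6, 1],
    [1, 1, 4, 1, 1],
]
d = distance_matrix(patterns)
for members in cluster(d, cut_off=0.5):
    print(members, "representative:", most_representative(members, d))
```

A cluster that ends up with a single member, like the last toy pattern here, corresponds to what the text calls an outlying pattern. In a two-member cluster both members have the same mean distance, so the choice of representative is arbitrary.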
References:
1) Kelley, L.A., Gardner, S.P., Sutcliffe, M.J. (1996) An automated approach
for clustering an ensemble of NMR-derived protein structures into
conformationally-related subfamilies, Protein Engineering, 9, 1063-1065.