XLSTAT - Agglomerative Hierarchical Clustering (AHC)
Use agglomerative hierarchical clustering to group similar observations into clusters, based on their description by a set of quantitative variables, binary (0/1) variables, or possibly variables of any type.
XLSTAT offers several aggregation methods:
- Ward's method (inertia)
- Ward's method (variance)
- Complete linkage
- Simple linkage
- Strong linkage
- Flexible linkage
- Unweighted pair-group average
- Weighted pair-group average
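
To illustrate how an aggregation method shapes the result, here is a minimal pure-Python sketch of agglomerative clustering under three textbook criteria. The helper names `agglomerate` and `cluster_distance` are hypothetical, and the method labels `"single"`, `"complete"` and `"average"` are assumed to correspond to simple linkage, complete linkage and the unweighted pair-group average; XLSTAT's own definitions and implementation may differ.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_distance(A, B, points, method):
    """Dissimilarity between clusters A and B (tuples of point indices)."""
    d = [euclidean(points[i], points[j]) for i in A for j in B]
    if method == "single":      # closest pair of members
        return min(d)
    if method == "complete":    # farthest pair of members
        return max(d)
    if method == "average":     # mean over all pairs (unweighted pair-group)
        return sum(d) / len(d)
    raise ValueError(f"unknown method: {method}")

def agglomerate(points, method="average"):
    """Naive agglomeration: repeatedly merge the two closest clusters.

    Returns the merge history as (cluster_a, cluster_b, height) tuples.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], points, method),
        )
        height = cluster_distance(a, b, points, method)
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, height))
    return merges

points = [(0.0,), (1.0,), (4.0,), (6.0,)]
for method in ("single", "complete", "average"):
    heights = [h for _, _, h in agglomerate(points, method)]
    print(method, heights)
```

On this toy data all three criteria merge the same pairs first; only the height of the final merge changes, which is what alters the shape of the dendrogram.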
XLSTAT offers several similarity and dissimilarity measures, each suited to a particular type of data:
For quantitative data:
| Similarity | Dissimilarity |
|---|---|
| Pearson's coefficient of correlation | Euclidean distance |
| Spearman's coefficient of rank correlation | Chi-square distance |
| Kendall's coefficient of rank correlation | Manhattan distance |
| Inertia | Pearson's dissimilarity |
| Covariance (n) | Spearman's dissimilarity |
| Covariance (n-1) | Kendall's dissimilarity |
| Percent agreement | Percent disagreement |
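
As a rough illustration of a few of the measures in the table, the sketch below computes them for two quantitative vectors using only the standard library. The function names are hypothetical, and the Pearson dissimilarity is assumed here to be `1 - r`, which is one common convention; the exact scaling XLSTAT applies is not stated in this page.

```python
import math

def euclidean(x, y):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Manhattan distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def covariance(x, y, ddof=0):
    """Covariance with divisor n (ddof=0) or n-1 (ddof=1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - ddof)

def pearson_r(x, y):
    """Pearson's coefficient of correlation."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

def pearson_dissimilarity(x, y):
    """Assumed convention: 1 - r, so perfectly correlated vectors score 0."""
    return 1.0 - pearson_r(x, y)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(euclidean([0.0, 0.0], [3.0, 4.0]))   # distance between (0,0) and (3,4)
print(manhattan([0.0, 0.0], [3.0, 4.0]))
print(covariance(x, y, ddof=0), covariance(x, y, ddof=1))
print(pearson_dissimilarity(x, y))
```

The two covariance variants differ only in the divisor, which matters mainly for small samples.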
For binary data (0/1):
Similarity/Dissimilarity:
- Jaccard's coefficient
- Dice coefficient
- Sokal & Sneath coefficient (2)
- Rogers & Tanimoto coefficient
- Simple matching coefficient
- Sokal & Sneath coefficient (1)
- Phi coefficient
- Ochiai's coefficient
- Kulczinski's coefficient
- Percent agreement
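
Most binary coefficients are built from the same four counts: a (both 1), b (first 1, second 0), c (first 0, second 1) and d (both 0). The sketch below uses the usual textbook formulas for a handful of the coefficients listed above; the function names are hypothetical, and XLSTAT's exact definitions (in particular the numbering of the two Sokal & Sneath variants, which are omitted here) are not confirmed by this page.

```python
import math

def counts(u, v):
    """Return (a, b, c, d) for two equal-length 0/1 vectors."""
    a = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(u, v) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(u, v) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(u, v) if x == 0 and y == 0)
    return a, b, c, d

def jaccard(u, v):
    a, b, c, _ = counts(u, v)
    return a / (a + b + c)              # joint absences are ignored

def dice(u, v):
    a, b, c, _ = counts(u, v)
    return 2 * a / (2 * a + b + c)      # joint presences count double

def simple_matching(u, v):
    a, b, c, d = counts(u, v)
    return (a + d) / (a + b + c + d)    # share of agreeing positions

def rogers_tanimoto(u, v):
    a, b, c, d = counts(u, v)
    return (a + d) / (a + d + 2 * (b + c))  # mismatches count double

def ochiai(u, v):
    a, b, c, _ = counts(u, v)
    return a / math.sqrt((a + b) * (a + c))

def phi(u, v):
    a, b, c, d = counts(u, v)
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

u = [1, 1, 0, 0, 1]
v = [1, 0, 0, 1, 1]
print(counts(u, v))
print(jaccard(u, v), dice(u, v), simple_matching(u, v))
```

This sketch does not guard against zero denominators (for example, two all-zero vectors under Jaccard); a real implementation would need to decide how to handle those cases.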
Note: for non-binary categorical variables, it is preferable to first perform a Multiple Correspondence Analysis (MCA) and to consider the coordinates of the observations on the factorial axes as new variables.
Copyright © 2008 Kovach Computing Services, Anglesey, Wales. All Rights Reserved. Portions copyright Addinsoft, Provalis Research, and Data Description Inc.
Last modified 25 January, 2008
