Hierarchical Clustering: Ward linkage

    Hi there!

    The Ward linkage method in agglomerative hierarchical clustering computes the distance between two clusters from the within-group variance, which amounts to a weighted squared distance between the cluster centers. Ward linkage therefore doesn't rely on the distances between single elements, so I would expect it to be independent of the metric (Euclidean, Manhattan, squared Euclidean...) used to compute the distances between the elements: in the end, the linkage criterion is based on the variance of the clusters, which has a definite formula regardless of the chosen metric. Nonetheless, when I try the hclust function in R, I obtain different results depending on the distance metric used between the elements. Why?
    Thank you
    Stephen Tashi

    Why do you think the variance of the clusters is independent of the metric? What formula are you talking about?

    The variance of a random variable representing a distance won't remain the same number if you change the units of measure. A quantity such as the "z-score" of a random variable representing a distance would remain numerically the same if the units of distance are changed, but it wouldn't necessarily remain the same if you switch from Euclidean distance to Manhattan distance.

    I notice that the Wikipedia article http://en.wikipedia.org/wiki/Ward's_method has a caution about passing the correct arguments when using Ward's method in the R programming language.
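
    To make the dependence on the metric concrete, here is a minimal sketch in plain Python (not R, and not how hclust is implemented internally). It assumes the usual definition of Ward's criterion as the increase in the within-cluster sum of squared Euclidean deviations; the three points and all function names are made up for illustration. It checks that this increase equals the weighted squared Euclidean distance between centroids, and that the pair of points merged first can change when Manhattan distances are used instead.

```python
from itertools import combinations


def sq_euclidean(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def manhattan(p, q):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def sse(cluster):
    """Within-cluster sum of squared Euclidean deviations from the centroid."""
    c = centroid(cluster)
    return sum(sq_euclidean(p, c) for p in cluster)


def ward_cost(a, b):
    """Increase in total within-cluster sum of squares caused by merging a and b."""
    return sse(a + b) - sse(a) - sse(b)


def ward_cost_centroids(a, b):
    """Same quantity written as a weighted squared distance between centroids."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * sq_euclidean(centroid(a), centroid(b))


def closest_pair(points, dist):
    """Indices of the two points with the smallest pairwise dissimilarity."""
    pairs = combinations(range(len(points)), 2)
    return min(pairs, key=lambda ij: dist(points[ij[0]], points[ij[1]]))


# Hypothetical toy data, chosen so that the two metrics disagree.
points = [(0.0, 0.0), (3.0, 3.0), (-5.0, 0.0)]
a = [points[0], points[1]]
b = [points[2]]

# The two expressions for the Ward merge cost agree (up to rounding).
print(ward_cost(a, b), ward_cost_centroids(a, b))

# The first pair to be merged depends on the dissimilarity that is supplied.
print(closest_pair(points, sq_euclidean))
print(closest_pair(points, manhattan))
```

    The "definite formula" for the variance is itself written in terms of squared Euclidean distances, so it is not metric-free. For two single points the merge cost is half their squared Euclidean distance, which is why the input dissimilarities matter from the very first merge. As far as I know, in recent versions of R, hclust offers the methods "ward.D" and "ward.D2", and "ward.D2" is the one that squares the supplied dissimilarities before updating, so it matches Ward's criterion when given plain Euclidean distances; please check ?hclust for your version.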
