Hierarchical Clustering: Ward linkage

eoghan
Hi there!

The Ward linkage method in agglomerative hierarchical clustering measures the distance between two clusters by the increase in within-group variance that merging them would cause, which works out to a weighted squared distance between the cluster centers. The Ward criterion therefore doesn't rely on the distances between single elements, so it should be independent of the metric (Euclidean, Manhattan, squared Euclidean...) used to compute the distances among the elements, because in the end the linkage criterion is based on the variance of the clusters which has a definite formula independently from the chosen metric. Nonetheless, if I try the hclust function in R, I obtain different results depending on the distance metric used among the elements. Why?
Thank you
 
eoghan said:
the variance of the clusters which has a definite formula independently from the chosen metric.

Why do you think the variance of the clusters is independent of the metric? What formula are you talking about?

The variance of a random variable representing a distance won't remain the same number if you change the units of measure. A quantity such as the "z-score" of a random variable representing a distance would remain numerically the same if the units of distance are changed, but it wouldn't necessarily remain the same if you switch from using Euclidean distance to Manhattan distance.
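
For reference, here is a sketch of the formula usually meant by Ward's criterion. The notation is standard textbook notation and is not taken from the posts above: $c_A$ is the centroid of cluster $A$, $|A|$ is its number of elements, and $\Delta(A,B)$ is the increase in the within-cluster sum of squares when $A$ and $B$ are merged.

```latex
\Delta(A,B) = \frac{|A|\,|B|}{|A|+|B|}\,\lVert c_A - c_B \rVert_2^2,
\qquad
\sum_{x \in C} \lVert x - c_C \rVert_2^2
  = \frac{1}{2\,|C|} \sum_{x \in C} \sum_{y \in C} \lVert x - y \rVert_2^2
```

Both expressions are built on the squared Euclidean norm, so the "variance" is itself tied to one particular metric. The second identity also shows that the within-cluster sum of squares can be recovered from pairwise distances alone, but only if those distances are Euclidean. If Manhattan distances are plugged in instead, the right-hand side is no longer the variance of anything.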

I notice the Wikipedia article http://en.wikipedia.org/wiki/Ward's_method has a caution about using the correct arguments in the R programming language.
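
To see why the input metric matters, it may help to remember that a function like hclust only receives a dissimilarity matrix, never the original coordinates, so it cannot compute centroids directly. As far as I know, the R documentation for hclust distinguishes the methods "ward.D" and "ward.D2" for this reason, with "ward.D2" squaring the dissimilarities before the cluster update. Below is a minimal Python sketch of that idea, not R's actual implementation: it applies the Lance-Williams recurrence for Ward's criterion to squared dissimilarities. The function names and the three sample points are made up for illustration.

```python
from itertools import combinations


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def ward_merges(points, metric):
    """Return [(members_a, members_b, squared_dissimilarity), ...] in merge order.

    Hypothetical helper: it only ever looks at pairwise dissimilarities,
    squares them once, and then updates them with the Lance-Williams
    recurrence for Ward's criterion.
    """
    members = {i: (i,) for i in range(len(points))}
    d2 = {
        frozenset(pair): metric(points[pair[0]], points[pair[1]]) ** 2
        for pair in combinations(members, 2)
    }
    merges = []
    next_id = len(points)
    while len(members) > 1:
        # Closest pair of current clusters; ties are broken by cluster ids.
        i, j = sorted(min(d2, key=lambda k: (d2[k], sorted(k))))
        d_ij = d2.pop(frozenset((i, j)))
        n_i, n_j = len(members[i]), len(members[j])
        for m in members:
            if m in (i, j):
                continue
            n_m = len(members[m])
            d_im = d2.pop(frozenset((i, m)))
            d_jm = d2.pop(frozenset((j, m)))
            # Lance-Williams update for Ward, on squared dissimilarities.
            d2[frozenset((next_id, m))] = (
                (n_i + n_m) * d_im + (n_j + n_m) * d_jm - n_m * d_ij
            ) / (n_i + n_j + n_m)
        merges.append((members[i], members[j], d_ij))
        members[next_id] = tuple(sorted(members.pop(i) + members.pop(j)))
        next_id += 1
    return merges


# Made-up sample: point 1 is diagonal from point 0, point 2 lies along an axis.
points = [(0, 0), (2, 2), (-3, 0)]
for name, metric in (("euclidean", euclidean), ("manhattan", manhattan)):
    first = ward_merges(points, metric)[0]
    print(name, "first merge:", first[0], first[1])
```

With these three points the Euclidean input joins points 0 and 1 first, while the Manhattan input joins points 0 and 2 first, so the two trees differ even though the update rule is identical. Only with Euclidean input do the updated values correspond to the centroid-based Ward cost (up to a constant factor of 2 in this convention).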
 