- #1
Jamin2112
So MathWorks.com shows this as an example:
d = pdist(meas);
Z = linkage(d);
c = cluster(Z,'maxclust',3:5);
http://www.mathworks.com/help/stats/cluster.html.
I'm confused about why this routine gives any useful information. First it returns the Euclidean distances between the values in some array meas. Then it performs hierarchical clustering on those distances. How is that useful? If I had a vector (0, 1, 50, 99, 100), then the distances are |0-1|=1, |0-50|=50, |0-99|=99, |0-100|=100, |1-50|=49, |1-99|=98, |1-100|=99, |50-99|=49, |50-100|=50, |99-100|=1. So I'm then clustering the values 1, 50, 99, 100, 49, 98, 99, 49, 50, 1. If I tell it to form a maximum of 3 clusters, the clusters will probably be (1, 1), (49, 49, 50, 50), and (98, 99, 99, 100). The first cluster corresponds to the distance between 0 and 1 and the distance between 99 and 100. So that means I'm clustering the values 0, 1, 99, and 100 together.
Or am I totally not understanding this?
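One way to test this reading is to run the same three calls on the five-value vector and look at the size of the result. This is only a minimal sketch: it assumes the Statistics and Machine Learning Toolbox is installed, and the variable name x is introduced here for illustration (the documentation example uses meas).

```matlab
% Sketch: the documentation example applied to the vector from the post.
% Assumes the Statistics and Machine Learning Toolbox (pdist, linkage, cluster).
x = [0; 1; 50; 99; 100];        % one observation per row

d = pdist(x);                   % row vector of the 10 pairwise Euclidean distances
disp(d)                         % same ten distances as listed in the post

Z = linkage(d);                 % hierarchical tree; single linkage is the default
c = cluster(Z, 'maxclust', 3);  % cut the tree into at most 3 clusters

disp(size(c))                   % number of labels returned
disp([x c])                     % each original value next to its cluster label
```

The size of c shows what is being labelled: cluster returns one label per row of x (five labels here), not one per entry of d. The distances are used by linkage only to decide which observations to merge.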