- #1
noplacebos
Greetings,
I have a quick question that may be trivial, but I have been scratching my head for weeks without finding anything concrete in books, papers, or on the web.
I have completed a partitional clustering of a dataset (vectors) using Root Mean Square Deviation (RMSD) as my distance metric. Leaving all other details of the clustering method aside, data points were assigned to a cluster if their RMSD to the cluster's representative was below a predefined threshold. Let's say that this threshold is 1.
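To make the assignment rule concrete, here is a minimal sketch of what I mean. It is not my actual method: the `rmsd` and `assign` helpers are hypothetical names, RMSD is assumed to be the plain component-wise form (no superposition or alignment), and picking the nearest qualifying representative is an assumption made only for illustration.

```python
import math

def rmsd(a, b):
    """Plain RMSD between two equal-length vectors (assumed: no alignment step)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def assign(points, representatives, threshold=1.0):
    """Label each point with the index of the nearest representative whose
    RMSD is below the threshold, or None if no representative qualifies."""
    labels = []
    for p in points:
        best, best_d = None, threshold
        for k, rep in enumerate(representatives):
            d = rmsd(p, rep)
            if d < best_d:  # strictly below the threshold (or the current best)
                best, best_d = k, d
        labels.append(best)
    return labels

# Toy data, invented for illustration only.
reps = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
pts = [[0.5, 0.5, 0.5], [5.0, 5.5, 4.5], [10.0, 10.0, 10.0]]
labels = assign(pts, reps)
print(labels)
```

The third toy point is farther than the threshold from both representatives, so it stays unassigned.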
My question concerns the shape of the resulting clusters and the general properties of RMSD as a proximity function. Are my clusters spherical (globular) because of this RMSD threshold, and do they actually have a radius of 1? Can I assume by default that any two data points within a cluster will have a pairwise RMSD of less than 2 (the diameter)?
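One way to probe the "diameter of 2" question numerically is sketched below. It assumes the plain component-wise RMSD (no alignment), and the `sample_within` helper, the dimension of 10, and the sample size are all hypothetical choices. It draws random vectors whose RMSD to a representative is below 1 and then looks at the largest pairwise RMSD among them.

```python
import itertools
import math
import random

def rmsd(a, b):
    """Plain RMSD between two equal-length vectors (assumed: no alignment step)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def sample_within(rep, threshold, rng):
    """Draw a random vector whose RMSD to rep is strictly below threshold."""
    n = len(rep)
    direction = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(d * d for d in direction))
    # For plain RMSD, an RMSD of r corresponds to a Euclidean distance of r * sqrt(n).
    radius = threshold * math.sqrt(n) * rng.uniform(0.0, 0.999)
    return [r + radius * d / norm for r, d in zip(rep, direction)]

rng = random.Random(0)   # fixed seed so the run is repeatable
rep = [0.0] * 10         # hypothetical 10-dimensional representative
members = [sample_within(rep, 1.0, rng) for _ in range(200)]

max_to_rep = max(rmsd(m, rep) for m in members)
max_pairwise = max(rmsd(a, b) for a, b in itertools.combinations(members, 2))
print(max_to_rep, max_pairwise)
```

A random check like this cannot prove the bound, but under the stated assumption the triangle inequality gives RMSD(a, b) <= RMSD(a, rep) + RMSD(rep, b) < 2 for any two members, which is what the run should reflect.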
I keep telling myself that, since RMSD doesn't exactly reflect the sum of Euclidean distances between two vectors, the true cluster shape may lie in some multidimensional space. If I apply multidimensional scaling down to 3 or 2 dimensions, should I expect a globular cluster shape this time? Does it depend on the nature of the initial vectors (e.g. number of parameters)?
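For reference, if RMSD here means the plain component-wise form over vectors with n parameters (an assumption: no superposition or alignment step is involved), its relation to the Euclidean distance can be written as:

```latex
$$
\mathrm{RMSD}(x, y)
  = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - y_i)^2}
  = \frac{\lVert x - y \rVert_2}{\sqrt{n}}
$$
```

Under that assumption, an RMSD threshold of 1 corresponds to a Euclidean ball of radius sqrt(n) around the representative in the original n-dimensional space, so the number of parameters only rescales the radius.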
Please excuse all these questions, but I remain very confused on this matter. Any pointers in the right direction would be most helpful.
Thank you very much for your time.