## Main Question or Discussion Point

Hi all,

Note: The text below is the motivation for my question. To jump to the question immediately, please skip to the line that says HI!!!!

I have a set of data points, let's call it A, and I ran principal component analysis to get the top 3 principal components so that I can represent the data points as a 3D plot.

Now, I have another set of data points, let's call it B, and I want to see how B differs from A. To do so, I want to plot B along the top 3 principal components of A. However, this coordinate system may be unfair to B, because most of the variance of B may not be captured in the first 3 principal components of A. Therefore, I want to be able to measure how much of the variance of B is captured in the first 3 principal components of A. Since the principal components of A may not be eigenvectors of B's covariance matrix, I cannot simply read the variance off the eigenvalue associated with each principal component, as one does in PCA.

Therefore, my question is:

HI!!!! <---- For those who have been reading this post in its entirety, please ignore this

Suppose you are given a matrix M of data points. How do you measure how much variance in the dataset is captured in a particular coordinate of M?

As an example, suppose all my points are of the form (a,1) for different values of a and all a are distinct. Then the first coordinate will capture 100% of the variance while the second coordinate will capture 0% of the variance.
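To make the question concrete, here is a minimal sketch of the kind of computation I have in mind, not a definitive answer. It assumes rows of the matrix are data points, that "variance captured" means the sample variance along a direction divided by the total variance (the sum of the per-coordinate variances), and that the directions are orthonormal. The function names `variance_fraction_per_coordinate` and `variance_fraction_along` are hypothetical, and the data sets `A` and `B` are random placeholders, not my real data.

```python
import numpy as np


def variance_fraction_per_coordinate(M):
    """Fraction of the total variance carried by each column of M.

    Assumes rows are data points and the total variance is nonzero.
    """
    M = np.asarray(M, dtype=float)
    per_coordinate = M.var(axis=0, ddof=1)  # sample variance of each column
    return per_coordinate / per_coordinate.sum()


def variance_fraction_along(B, directions):
    """Fraction of B's total variance captured along each direction.

    `directions` holds one orthonormal direction per column.
    B is centered on its own mean (an assumption of this sketch).
    """
    B = np.asarray(B, dtype=float)
    B_centered = B - B.mean(axis=0)
    scores = B_centered @ directions  # coordinates of B along each direction
    total = B_centered.var(axis=0, ddof=1).sum()
    return scores.var(axis=0, ddof=1) / total


# The (a, 1) example: all variance sits in the first coordinate.
M = np.array([[0.0, 1.0], [2.0, 1.0], [5.0, 1.0]])
frac = variance_fraction_per_coordinate(M)

# Placeholder data standing in for A and B.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 2.0, 0.5, 0.1])
B = rng.normal(size=(150, 5))

# Principal components of A from the SVD of the centered data.
A_centered = A - A.mean(axis=0)
_, _, Vt = np.linalg.svd(A_centered, full_matrices=False)
top3 = Vt[:3].T  # shape (5, 3), one principal component per column

frac_B = variance_fraction_along(B, top3)
captured = frac_B.sum()  # share of B's variance in A's top 3 components
print(frac, frac_B, captured)
```

Because the directions are orthonormal, the fractions over a full set of principal components add up to 1, so `captured` would lie between 0 and 1 and could be compared directly with the share that A's own top 3 components explain for A.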
