Variance captured in coordinate axis.

In summary, the conversation discussed how to measure how much of a dataset's variance is captured along a particular coordinate. The motivation was to compare two sets of data points by plotting them in the same coordinate system, namely the first 3 principal components of one set. It was noted that this may be unfair to the other set if most of its variance does not lie along those 3 components. The question posed was therefore how to measure the variance captured in a specific coordinate. In the example provided, the first coordinate captures 100% of the variance while the second captures 0%. The suggested solution was to consider the coordinates separately and form a random variable for each coordinate.
  • #1
simpleton
Hi all,

Note: The text below is the motivation for my question. To jump to the question immediately, please skip to the line that says HI!

I have a set of data points, let's call it A, and I ran principal component analysis to get the top 3 principal components to be able to represent the data points as a 3D plot.

Now, I have another set of data points, let's call it B, and I want to see how B differs from A. To do so, I want to plot B along the top 3 principal components of A. However, this coordinate system may be unfair to B, because most of the variance of B may not be captured in the first 3 principal components of A. Therefore, I want to be able to measure how much of the variance of B is captured in the first 3 principal components of A. Since the principal components of A may not be eigenvectors of the covariance matrix of B, I cannot simply read the variance off from the eigenvalue of each corresponding principal component, as one does in PCA.

Therefore, my question is:

HI! <---- For those who have been reading this post in its entirety, please ignore this

Suppose you are given a matrix M of data points. How do you measure how much variance in the dataset is captured in a particular coordinate of M?

As an example, suppose all my points are of the form (a,1) for different values of a and all a are distinct. Then the first coordinate will capture 100% of the variance while the second coordinate will capture 0% of the variance.
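A minimal sketch of this example, assuming NumPy and made-up values for a (the array `M` and the name `share` are illustrative, not from the post). It treats each coordinate as its own variable and divides its variance by the sum of all per-coordinate variances:

```python
import numpy as np

# Hypothetical data: points of the form (a, 1) with distinct values of a.
M = np.array([[0.5, 1.0],
              [2.0, 1.0],
              [3.5, 1.0],
              [7.0, 1.0]])

# Variance of each coordinate (column), taken separately.
per_coord_var = M.var(axis=0, ddof=1)

# Share of the total variance carried by each coordinate.
# The total is the sum of the per-coordinate variances
# (the trace of the covariance matrix).
share = per_coord_var / per_coord_var.sum()
print(share)
```

Here the second column is constant, so its variance is zero and the first coordinate carries the entire total.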
 
  • #2
As in your example: consider the coordinates separately and form random variables for each coordinate.
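Applied to the original motivation, this suggestion might look like the sketch below: express B in the coordinates given by A's principal directions, then take the variance of each new coordinate separately. This is one possible reading, not necessarily what the reply had in mind. The function name `captured_variance_fraction` and the random test data are assumptions made for illustration.

```python
import numpy as np

def captured_variance_fraction(A, B, k=3):
    """Fraction of B's total variance along each of A's top-k principal directions."""
    # Principal directions of A: right singular vectors of the centered data.
    A_centered = A - A.mean(axis=0)
    _, _, Vt = np.linalg.svd(A_centered, full_matrices=False)
    components = Vt[:k]                    # shape (k, n_features), orthonormal rows

    # Center B on its own mean (variance does not depend on the shift),
    # then express it in A's principal coordinates.
    B_centered = B - B.mean(axis=0)
    scores = B_centered @ components.T     # shape (n_points, k)

    # Each new coordinate is its own random variable; take its variance.
    var_along = scores.var(axis=0, ddof=1)
    total_var = B_centered.var(axis=0, ddof=1).sum()
    return var_along / total_var

# Hypothetical data: A varies mostly in the first features, B in the last ones.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 2.0, 0.5, 0.1])
B = rng.normal(size=(150, 5)) * np.array([0.1, 0.5, 2.0, 3.0, 5.0])

frac = captured_variance_fraction(A, B)
print(frac, frac.sum())
```

Because the principal directions are orthonormal, the fractions over a full basis add up to 1, so the sum over the top 3 shows how much of B's spread the 3D plot would actually display.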
 

1. What is variance captured in coordinate axis?

Variance captured in a coordinate axis refers to the variance of the data points' values along that axis, that is, how widely those values spread around their mean. It is often expressed as a fraction of the total variance of the dataset, which is the sum of the variances along all axes of an orthogonal coordinate system.

2. How is variance captured in coordinate axis calculated?

Variance captured in a coordinate axis is calculated from the values of the data points along that axis: first find their mean, then subtract the mean from each value and square the result. These squared differences are then summed and divided by the total number of data points (or by one less than that number for the sample variance).
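The steps above can be sketched in plain Python. The function name `population_variance` and the sample values are illustrative assumptions. This version divides by the number of points, so it gives the population variance:

```python
def population_variance(values):
    """Mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(x - mean) ** 2 for x in values]
    return sum(squared_diffs) / n

# Hypothetical values of one coordinate across eight data points.
coordinate_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
variance = population_variance(coordinate_values)
print(variance)
```

For these values the mean is 5, the squared differences sum to 32, and the variance is 32 / 8 = 4.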

3. What does a high variance captured in coordinate axis indicate?

A high variance captured in coordinate axis indicates a large spread of data points around the mean, suggesting a wide range of values in the dataset. This can also suggest that the data points are not closely clustered together and may be more dispersed.

4. How does variance captured in coordinate axis relate to standard deviation?

Variance captured in coordinate axis and standard deviation are closely related, as they both measure the variability of a dataset. The square root of the variance is equal to the standard deviation, and they both provide a measure of how much the data points deviate from the mean.

5. Can variance captured in coordinate axis be negative?

No, variance captured in a coordinate axis cannot be negative. It is a sum of squared differences between data points and the mean, divided by a positive count, so it is always zero or greater. It is exactly zero when every data point has the same value along that axis, as with the second coordinate in the example above.
