Principal Components and the Residual Matrix

In summary, the thread discusses principal components and residual matrices. The original poster created a fake dataset and calculated its principal components, but only obtains a residual matrix of 0 when the data are standardized, which raises the question of whether standardization is necessary for this to hold. The conversation also touches on the role of the eigenvalues and on using the inverse of the PCA matrix to map from PCA space back to the original variables.
  • #1
Jrb599
I've been reading about principal components and residual matrices.

It's my understanding that if you use every principal component to recalculate your original data, the residual matrix should be 0.

So I created a fake dataset of two random variables and calculated the principal components.


When I compute eigenvector[1,1]*princomp[1,1] + eigenvector[1,2]*princomp[1,2], I recover var 1,
and similarly
eigenvector[2,1]*princomp[2,1] + eigenvector[2,2]*princomp[2,2] recovers var 2,

so the residual matrix is 0, which is what I wanted. However, this is only true when I standardize the data.

If I don't standardize the data, the two formulas I listed above don't hold.

What is throwing me for a loop is that none of the papers I read said anything about standardizing the data, yet it looks as though the data must be standardized for this to hold. I don't want to make any assumptions, so I thought I would ask: is this correct?
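To make the claim concrete, here is a minimal sketch (assuming NumPy and an eigendecomposition-based PCA; the two-variable dataset is made up) showing that reconstructing the data from all of its principal components leaves a zero residual matrix, provided the data are mean-centered first:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])  # fake 2-variable dataset

# PCA via eigendecomposition of the covariance of the *centered* data
Xc = X - X.mean(axis=0)
eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))  # columns of V are eigenvectors

scores = Xc @ V                             # principal component scores
X_rebuilt = scores @ V.T + X.mean(axis=0)   # full reconstruction, adding the mean back

residual = X - X_rebuilt
print(np.allclose(residual, 0))             # True: residual matrix is 0
```

If the mean is not subtracted before projecting (or not added back afterwards), the reconstruction is offset and the residual is no longer zero, which matches what the later posts in this thread uncover.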
 
  • #2
Hey Jrb599 and welcome to the forums.

Did you take into account the eigen-values for the principal component eigen-vectors?

The eigen-values represent the variance of each component, which is tied to the variances of the un-standardized random variables.
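A quick numerical check of this point (a sketch assuming NumPy; the dataset is made up): the eigenvalues of the covariance matrix equal the variances of the component scores, and they sum to the total variance of the original, un-standardized variables:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])  # deliberately unequal variances
Xc = X - X.mean(axis=0)

eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
scores = Xc @ V

# Each eigenvalue equals the variance of its component's scores,
# and the eigenvalues sum to the total variance of the variables.
print(np.allclose(eigvals, scores.var(axis=0, ddof=1)))              # True
print(np.isclose(eigvals.sum(), Xc.var(axis=0, ddof=1).sum()))       # True
```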
 
  • #3
Hi Chiro,

Thanks for the response. Yes, I've taken the eigenvalues into account, and I still can't get it to work.
 
  • #4
Just out of curiosity, what eigen-values do you get from PCA for the standardized data? Are they unit length?

Also, you should calculate the PCA matrix and take its inverse to go from PCA space back to the original space, since PCA is a linear transformation from the original space to the new one.

Try this to get the original random variables if you are in the initial PCA space.
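One way to see why this inverse transformation is cheap, sketched with NumPy on made-up data: the eigenvector matrix produced by the decomposition is orthogonal (its columns are unit-length eigenvectors), so its inverse is just its transpose, and either one maps the scores back to the centered original variables:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 2))
Xc = X - X.mean(axis=0)

eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))

# The eigenvector matrix is orthogonal, so its inverse is its transpose:
print(np.allclose(np.linalg.inv(V), V.T))   # True

# Going from PCA space back to the original (centered) variables:
scores = Xc @ V
back = scores @ np.linalg.inv(V)            # identical to scores @ V.T
print(np.allclose(back, Xc))                # True
```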
 
  • #5
Chiro - I realized the program I was using was still mean-centering the data. It's working now.
 

1. What is the purpose of performing Principal Component Analysis (PCA)?

PCA is a statistical method used to reduce the dimensionality of a dataset by identifying the most important features, or principal components, that explain the majority of the variance in the data. This allows for easier interpretation and visualization of the data.

2. How do you interpret the principal components and the residual matrix?

The principal components represent linear combinations of the original features, with each component capturing a different aspect of the data's variance. The residual matrix contains the remaining variation in the data that is not explained by the principal components.
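As an illustration (a sketch assuming NumPy, with a made-up three-variable dataset): keeping only the top k components leaves a residual matrix whose total variance equals the sum of the discarded eigenvalues, and keeping all components drives it to zero as discussed above:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 3))  # made-up 3-variable dataset
Xc = X - X.mean(axis=0)

eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]       # sort components by variance, descending
eigvals, V = eigvals[order], V[:, order]

k = 2                                   # keep the top 2 of 3 components
approx = (Xc @ V[:, :k]) @ V[:, :k].T   # rank-k reconstruction of the centered data
residual = Xc - approx                  # variation the kept components don't explain

# The residual's total variance equals the sum of the dropped eigenvalues.
print(np.isclose(residual.var(axis=0, ddof=1).sum(), eigvals[k:].sum()))  # True
```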

3. Can PCA be used for feature selection?

Yes, PCA can be used as a feature selection technique by selecting the top principal components that explain a significant amount of the data's variance. This can help to reduce the number of features without losing important information.
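For example (a sketch assuming NumPy; the 95% cutoff and the dataset are arbitrary choices): rank the eigenvalues and keep the smallest number of components whose cumulative share of the variance reaches the threshold:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 4)) @ np.diag([3.0, 2.0, 0.5, 0.1])  # made-up 4-variable dataset
Xc = X - X.mean(axis=0)

eigvals = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]
explained = np.cumsum(eigvals) / eigvals.sum()   # cumulative fraction of variance

k = int(np.searchsorted(explained, 0.95)) + 1    # smallest k reaching the 95% threshold
print(k, explained[k - 1])
```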

4. What is the relationship between principal components and eigenvalues?

Each principal component is associated with an eigenvalue, which represents the amount of variance in the data that is explained by that component. The higher the eigenvalue, the more important the corresponding principal component is in explaining the data's variance.

5. Are there any assumptions or limitations when using PCA?

PCA captures only linear structure in the data, so strongly nonlinear relationships will not be summarized well by a few components. It works on numerical data, so categorical variables must be encoded first. It is also sensitive to the scale of the variables (hence the common practice of standardizing beforehand) and to outliers, which can dominate the leading components.
