Is There a Linear Transformation to Map Data Set X to Y in PCA?


Discussion Overview

The discussion revolves around the possibility of finding a linear transformation that maps a data set X to another data set Y through an intermediary transformation Z in the context of Principal Component Analysis (PCA). Participants explore the relationships between these transformations and their implications for dimensional reduction techniques.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Exploratory

Main Points Raised

  • One participant proposes that if K maps X to Z and P maps Z to Y, then a linear transformation M could be defined to map X directly to Y, expressed as M*X = Y.
  • Another participant suggests that M can be represented as the product M = P*K, indicating a direct relationship between the transformations.
  • There is a discussion about the properties of matrix multiplication, particularly that it is not commutative but is associative, which affects how transformations can be combined.
  • Concerns are raised about the implications of X not being square, which may complicate the ability to find M through simple substitution.
  • Participants discuss the importance of standardizing variables to ensure comparability in the context of PCA, particularly when different measurements have vastly different variances.

Areas of Agreement / Disagreement

There appears to be some agreement on the formulation of M as P*K, but the discussion also highlights uncertainties regarding the implications of matrix properties and the conditions under which these transformations hold. The discussion remains somewhat unresolved regarding the complexities introduced by the dimensions of X.

Contextual Notes

Participants note the potential issues with non-square matrices and the need for standardization in PCA, which may affect the applicability of the proposed transformations.

Who May Find This Useful

This discussion may be useful for those interested in linear algebra, data transformation techniques, and applications of PCA in data analysis and dimensional reduction.

Trentkg
This question broadly relates to principal component analysis (PCA).

Say you have some data vector X, and a linear transformation K that maps X to some new data vector Z:

K*X → Z

Now say you have another linear transformation P that maps Z to a new data vector Y:

P*Z → Y

is there a linear transformation, call it M, that maps X to Y?

M*X → Y?

If Y, Z, P and X are known, can we solve for M? I would think we could find M by simple substitution...

M*X → Y,
P*Z → Y,
M → X^-1*P*Z = X^-1 (P*K*X) ?

We'll run into serious problems here if X is not square.
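That concern is easy to see numerically. A minimal NumPy sketch (the shapes here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# X is m x n with m != n: say 3 measurements, 5 trials -- not square
X = rng.standard_normal((3, 5))

try:
    np.linalg.inv(X)              # inv is only defined for square matrices
    invertible = True
except np.linalg.LinAlgError:
    invertible = False

print(invertible)  # False: X^-1 does not exist for a non-square X
```

So any approach that relies on forming X^-1 breaks down as soon as the number of measurements differs from the number of trials.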

WHY I'M ASKING THIS QUESTION AND HOW IT RELATES TO PCA:

Without going into too much detail, PCA is a dimensional reduction technique. It seeks to find a linear transformation P that maps Z to Y such that the matrix Cy, defined as

Cy = (1/n) Y*Y^T,

is diagonalized.

Cy is the covariance matrix: the diagonal terms represent the variances of the individual measurements, while the off-diagonal terms represent the covariances between them. (To see why this is true, write Y as an m×n matrix with mean-zero rows Yi. What does the element (1/n) Yi·Yj look like? What about (1/n) Yi·Yi?) When Cy is diagonalized, the diagonal terms (the variances) are maximized, while the off-diagonal terms (the covariances) are set to zero. Y is Z in a new basis, with the highest variance of the system aligned along the first eigenvector, the second highest aligned along the second, and so forth. The idea is that if there are 20 measurements in a system, yet you can express 99% of the variance in only the first 4 eigenvectors, then your 20-dimensional system can probably be reduced to 4. (A better explanation can be found here: http://www.snl.salk.edu/~shlens/pca.pdf )
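For concreteness, the diagonalization step can be sketched in NumPy (synthetic data; np.linalg.eigh is used because the covariance matrix is symmetric):

```python
import numpy as np

rng = np.random.default_rng(1)

# Z: m x n data matrix (4 measurements, 200 trials), rows made mean-zero
Z = rng.standard_normal((4, 200))
Z = Z - Z.mean(axis=1, keepdims=True)

n = Z.shape[1]
Cz = (Z @ Z.T) / n                  # covariance matrix of Z

# rows of P are the eigenvectors of Cz, ordered by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(Cz)
order = np.argsort(eigvals)[::-1]
P = eigvecs[:, order].T

Y = P @ Z
Cy = (Y @ Y.T) / n                  # diagonal: variances, largest first

# the off-diagonal (covariance) terms vanish up to floating-point error
off_diag = Cy - np.diag(np.diag(Cy))
print(np.max(np.abs(off_diag)))     # effectively zero
```

The diagonal of Cy then holds the variances sorted from largest to smallest, matching the eigenvector ordering described above.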

Usually X is set up as an m×n matrix, where m is the number of different measurements and n the number of trials. The first transformation, K, could be standardization/normalization, or a change of units. The fear is that the variance of one measurement will dominate: if m1 has variance 1 and m2 has variance 10000, then m2 will dominate the covariance matrix, even when the disparity is just an artifact of the units chosen (say, m1 measured in km and m2 in cm). Hence, we must standardize the variables with a map K so that they are comparable.
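A minimal sketch of such a standardizing K, assuming (as one common choice) that K is just a diagonal matrix of reciprocal standard deviations applied to mean-centered rows:

```python
import numpy as np

rng = np.random.default_rng(2)

# two measurements with wildly different variances, 1000 trials each
X = np.vstack([rng.normal(0.0, 1.0, 1000),     # m1: variance ~1
               rng.normal(0.0, 100.0, 1000)])  # m2: variance ~10000
X = X - X.mean(axis=1, keepdims=True)          # mean-center each row

# K divides each measurement (row) by its own standard deviation
K = np.diag(1.0 / X.std(axis=1))
Z = K @ X

print(Z.std(axis=1))   # both rows now have unit standard deviation
```

After applying K, neither measurement can dominate the covariance matrix purely because of its scale.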

The problem, then, is that the eigenvectors that map Z to Y are expressed in terms of the data set Z. Data set Z may not be of any interest to the experimenter (in this case, me!). I'm interested in what the eigenvectors/principal components of data set X are!


Anyways, thanks for any help!
 
M=P*K of course!
 
Erland said:
M=P*K of course!

M = X^-1 (P*K)X

so you're saying

X^-1 (P*K)X = (P*K)X^-1 *X ?

My linear algebra is a little rusty, but isn't matrix multiplication non-commutative?
 
K*X=Z, P*Z=Y. Hence, Y=P*Z=P*(K*X)=(P*K)*X, so we can set M=P*K. Matrix multiplication is not commutative, but it is associative.
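This is easy to verify numerically (a sketch with arbitrary random matrices standing in for K, P, and X):

```python
import numpy as np

rng = np.random.default_rng(3)

X = rng.standard_normal((3, 5))   # 3 measurements, 5 trials
K = rng.standard_normal((3, 3))   # e.g. a standardization map
P = rng.standard_normal((3, 3))   # e.g. a PCA rotation

Z = K @ X
Y = P @ Z

# associativity: (P*K)*X equals P*(K*X), even though P*K != K*P in general
M = P @ K
print(np.allclose(M @ X, Y))      # True
```

Note that no inverse of X is ever needed, so M = P*K works even when X is non-square.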
 
Ah yes! Of course, how simple. Thank you, Erland!
 
