Is There a Linear Transformation to Map Data Set X to Y in PCA?

In summary, the question asks whether there is a linear transformation M that maps X to Y, given that the maps K (X to Z) and P (Z to Y) are known. The answer is yes: M = P*K, by associativity of matrix multiplication, and X does not need to be square.
  • #1
Trentkg
This question broadly relates to principal component analysis (PCA).

Say you have some data vector X, and a linear transformation K that maps X to some new data vector Z:

K*X → Z

Now say you have another linear transformation P that maps Z to a new data vector Y:

P*Z → Y

is there a linear transformation, call it M, that maps X to Y?

M*X → Y?

If Y, Z, P and X are known, can we solve for M? I would think we could find M by simple substitution...

M*X → Y,
P*Z → Y,
M → X^-1*P*Z = X^-1 (P*K*X) ?

We'll run into serious problems here if X is not square.

WHY I'M ASKING THIS QUESTION AND HOW IT RELATES TO PCA:

Without going into too much detail, PCA is a dimensionality-reduction technique. It seeks a linear transformation P that maps Z to Y such that the matrix Cy, defined as:

Cy = (1/n) Y*Y^T, is diagonalized.

Cy is the covariance matrix: the diagonal terms are the variances of the individual measurements, while the off-diagonal terms are the covariances between them. (To see why this is true, write Y as an MxN matrix with mean-zero rows. What does the element (1/n) Yi*Yj look like? What about (1/n) Yi*Yi?) When Cy is diagonalized, the off-diagonal terms (the covariances) are set to zero, and the variance of the system is concentrated along the new axes. Y is Z in a new basis: the direction of highest variance is aligned along the first eigenvector, the second highest along the second, and so forth. The idea is that if a system has 20 measurements, yet 99% of its variance is captured by the first 4 eigenvectors, then the 20-dimensional system can probably be reduced to 4 dimensions. (A better explanation can be found here: http://www.snl.salk.edu/~shlens/pca.pdf )
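As a sketch of the idea above (assuming NumPy, with random toy data standing in for real measurements), diagonalizing the covariance matrix of mean-centered data Z yields the rotation P, and the covariance of Y = P*Z comes out diagonal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 measurements (rows) x 200 trials (columns), mean-centered.
Z = rng.normal(size=(3, 200))
Z = Z - Z.mean(axis=1, keepdims=True)

n = Z.shape[1]
Cz = (Z @ Z.T) / n                    # covariance matrix of Z

# Eigenvectors of the symmetric matrix Cz give the rotation;
# rows of P are the principal axes, sorted by descending variance.
eigvals, eigvecs = np.linalg.eigh(Cz)
order = np.argsort(eigvals)[::-1]
P = eigvecs[:, order].T

Y = P @ Z                             # data in the new basis
Cy = (Y @ Y.T) / n                    # numerically diagonal

off_diag = Cy - np.diag(np.diag(Cy))
print(np.allclose(off_diag, 0, atol=1e-10))   # True: Cy is diagonalized
```

The diagonal of Cy holds the eigenvalues (the variances along each principal axis) in decreasing order.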

Usually X is set up as an mxn matrix, where m is the number of different measurements and n the number of trials. The first transformation, K, could be standardization/normalization or a change of units. The fear is that the variance of one measurement will dominate purely because of its scale: if m1 has variance 1 and m2 has variance 10000, then m2 will dominate the covariance matrix, even if that difference only reflects m1 being measured in cm and m2 in km. Hence, we must standardize the variables with a map K so they are comparable.
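A minimal sketch of such a map K (assuming NumPy, with synthetic data): standardization is itself a linear transformation, namely a diagonal matrix whose entries are the reciprocal standard deviations of each measurement row:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two measurements with wildly different scales: variance ~1 vs ~10000.
X = np.vstack([rng.normal(0, 1, 500),
               rng.normal(0, 100, 500)])
X = X - X.mean(axis=1, keepdims=True)   # mean-center each row

# Standardization as a linear map: K = diag(1 / std of each row).
stds = X.std(axis=1)
K = np.diag(1.0 / stds)

Z = K @ X
print(Z.std(axis=1))   # both rows now have unit variance
```

Because K is just a matrix, it composes with any later transformation by ordinary matrix multiplication.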

The problem, then, is that the eigenvectors of the transformation mapping Z to Y are expressed in terms of the data set Z. Data set Z may not be of any interest to the experimenter (in this case, ME!). I'm interested in the eigenvectors/principal components of data set X!


Anyways, thanks for any help!
 
  • #2
M=P*K of course!
 
  • #3
Erland said:
M=P*K of course!

M = X^-1 (P*K)X

so you're saying

X^-1 (P*K)X = (P*K)X^-1 *X ?

My linear algebra is a little rusty, but isn't matrix multiplication non-commutative?
 
  • #4
K*X=Z, P*Z=Y. Hence, Y=P*Z=P*(K*X)=(P*K)*X, so we can set M=P*K. Matrix multiplication is not commutative, but it is associative.
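This is easy to check numerically (a sketch assuming NumPy, with random matrices whose shapes are deliberately chosen so that X is not square):

```python
import numpy as np

rng = np.random.default_rng(2)

# K maps 5-dimensional data down to 3 dimensions, P maps 3 to 3;
# X holds 5 measurements over 10 trials, so X is not square.
X = rng.normal(size=(5, 10))
K = rng.normal(size=(3, 5))
P = rng.normal(size=(3, 3))

Z = K @ X
Y = P @ Z

M = P @ K                     # associativity: (P*K)*X = P*(K*X)
print(np.allclose(M @ X, Y))  # True, even though X is not square
```

No inverse of X is needed anywhere, which is why the squareness of X never enters into it.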
 
  • #5
Ah yes! Of course, how simple. Thank you, Erland!
 

1. What is a consecutive linear map?

A consecutive linear map is a composition of linear transformations: a sequence of maps applied one after another, each taking a vector in one vector space to a vector in another while preserving the structure of the space (vector addition and scalar multiplication). Since the composition of linear maps is itself linear, the overall map is again a single linear transformation.

2. How is a consecutive linear map different from a regular linear map?

A consecutive linear map is a sequence of multiple linear transformations performed one after the other, while a regular linear map is a single linear transformation. In other words, a consecutive linear map is a composition of linear maps, while a regular linear map is a single function.

3. What is the purpose of consecutive linear maps in PCA?

Consecutive linear maps are used in Principal Component Analysis (PCA) to reduce the dimensionality of a dataset. The consecutive linear maps are used to transform the original dataset into a new coordinate system, where the first axis captures the most variation in the data, the second axis captures the second most variation, and so on. This allows for easier visualization and analysis of the data.

4. How do you determine the number of consecutive linear maps needed for PCA?

PCA itself usually involves only a short chain of maps, for example one standardization/centering map followed by one projection onto the principal components. What you choose is the number of principal components to keep, which should be less than or equal to the number of original dimensions in the dataset; that choice determines the dimension of the reduced data, not the number of maps in the chain.

5. Can consecutive linear maps be reversed in PCA?

Yes, in the sense that the transformed data can be mapped back toward the original coordinate system by inverting the maps (for an orthogonal projection, by applying its transpose). However, if dimensionality was actually reduced, the reconstruction is only approximate: the variance carried by the discarded components is lost, so the recovered data will not exactly match the original.
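As an illustrative sketch (assuming NumPy; the data are synthetic), keeping only the top principal component and then mapping back with the transpose of the projection reconstructs the data only approximately:

```python
import numpy as np

rng = np.random.default_rng(3)

# Correlated 3-dimensional data where most variance lies along one direction.
base = rng.normal(size=(1, 300))
X = np.vstack([base, 0.9 * base, 0.1 * base]) + 0.01 * rng.normal(size=(3, 300))
X = X - X.mean(axis=1, keepdims=True)

C = (X @ X.T) / X.shape[1]
eigvals, eigvecs = np.linalg.eigh(C)
P = eigvecs[:, np.argsort(eigvals)[::-1]].T   # rows = principal axes

k = 1                   # keep only the top component
Y = P[:k] @ X           # reduced representation (1 x 300)
X_rec = P[:k].T @ Y     # map back: P is orthogonal, so its transpose reverses the rotation

# Small but nonzero relative reconstruction error: the discarded
# components' variance is lost.
err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
print(err < 0.1, err > 0)
```

The error stays small here only because the discarded components carried little variance; in general it grows with the variance thrown away.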
