(Sequential) Principal Component Analysis

  • #1
javicm
Hello everybody!

Have any of you read the paper "EM algorithms for PCA and SPCA" by Sam Roweis? If so, maybe you could help me! :-) I'm having trouble with equation 2b; in fact, my problem is that I can't manage to derive it. If you haven't read the paper but are curious: how would you prove that
[tex]\frac{N(Cx,R)|_yN(0,I)|_x}{N(0,CC^T+R)|_y}[/tex] follows this distribution: [tex]N(\beta y, I-\beta C)|_x[/tex], where [tex]N(A,B)|_c[/tex] means a normal (Gaussian) distribution with mean A, covariance matrix B and evaluated at c, and [tex]\beta = C^T(CC^T+R)^{-1}[/tex].

Thanks a lot!
Javier
 
  • #2


Hello Javier,

I have not read the paper "EM algorithms for PCA and SPCA" in detail, but the expression you quote is just Bayes' rule for a linear-Gaussian model, so equation 2b follows from completing the square.

The model is [tex]x \sim N(0,I)[/tex] and [tex]y \mid x \sim N(Cx,R)[/tex], so the marginal is [tex]y \sim N(0,CC^T+R)[/tex] and the ratio you wrote is exactly the posterior [tex]p(x \mid y)[/tex]:

[tex]p(x \mid y) = \frac{N(Cx,R)|_y\,N(0,I)|_x}{N(0,CC^T+R)|_y} \propto \exp\left\{-\tfrac{1}{2}\left[(y-Cx)^T R^{-1}(y-Cx) + x^T x\right]\right\}.[/tex]

Collecting the terms that involve [tex]x[/tex], the exponent is

[tex]-\tfrac{1}{2}\left[x^T\left(I + C^T R^{-1} C\right)x - 2\,x^T C^T R^{-1} y\right] + \mathrm{const},[/tex]

which is the exponent of a Gaussian in [tex]x[/tex] with

[tex]\Sigma = \left(I + C^T R^{-1} C\right)^{-1}, \qquad \mu = \Sigma\, C^T R^{-1} y.[/tex]

Two standard matrix identities put this in the form you want. By the matrix inversion (Woodbury) lemma,

[tex]\left(I + C^T R^{-1} C\right)^{-1} = I - C^T\left(CC^T + R\right)^{-1} C = I - \beta C.[/tex]

And since [tex]C^T R^{-1}\left(CC^T + R\right) = \left(I + C^T R^{-1} C\right)C^T[/tex], we also have

[tex]\left(I + C^T R^{-1} C\right)^{-1} C^T R^{-1} = C^T\left(CC^T + R\right)^{-1} = \beta,[/tex]

so [tex]\mu = \beta y[/tex]. Putting the two together,

[tex]\frac{N(Cx,R)|_y\,N(0,I)|_x}{N(0,CC^T+R)|_y} = N(\beta y,\; I - \beta C)|_x,[/tex]

which is equation 2b. Equivalently, you can write down the joint Gaussian of [tex](x,y)[/tex], whose covariance blocks are [tex]I[/tex], [tex]C^T[/tex], [tex]C[/tex] and [tex]CC^T+R[/tex], and apply the standard formulas for conditioning a Gaussian; they give the same mean [tex]\beta y[/tex] and covariance [tex]I - \beta C[/tex].
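If it helps, here is a quick numerical sanity check of the two matrix identities (just a NumPy sketch with arbitrary dimensions and random C and R, not code from the paper):

[code]
import numpy as np

rng = np.random.default_rng(0)
p, k = 5, 2                            # observed and latent dimensions (arbitrary)
C = rng.normal(size=(p, k))            # loading matrix C
A = rng.normal(size=(p, p))
R = A @ A.T + p * np.eye(p)            # any symmetric positive-definite noise covariance

beta = C.T @ np.linalg.inv(C @ C.T + R)                        # beta = C^T (C C^T + R)^{-1}
Sigma = np.linalg.inv(np.eye(k) + C.T @ np.linalg.inv(R) @ C)  # (I + C^T R^{-1} C)^{-1}

# Identity 1: posterior covariance  (I + C^T R^{-1} C)^{-1} = I - beta C
print(np.allclose(Sigma, np.eye(k) - beta @ C))                # True

# Identity 2: posterior mean coefficient  Sigma C^T R^{-1} = beta
print(np.allclose(Sigma @ C.T @ np.linalg.inv(R), beta))       # True
[/code]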
 

1. What is Principal Component Analysis (PCA)?

Principal Component Analysis is a statistical method used to reduce the dimensionality of a dataset while preserving as much of the original variation as possible. It does this by transforming the original variables into a new set of variables, called principal components, which are linear combinations of the original variables.
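For concreteness, here is a minimal sketch (not taken from any particular library; X is assumed to be a NumPy array with one row per observation) of computing the principal components from the eigendecomposition of the sample covariance matrix:

[code]
import numpy as np

def pca(X, k):
    """Return the top-k principal directions and the projected (score) matrix."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = Xc.T @ Xc / (len(Xc) - 1)         # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    W = eigvecs[:, order]                   # columns are the principal directions
    return W, Xc @ W                        # directions and k-dimensional scores
[/code]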

2. How does PCA work?

PCA works by finding the directions of maximum variation in the data and projecting the data onto these directions, known as principal components. The first principal component explains the most variation in the data, followed by the second principal component, and so on. This allows for a reduction in the number of variables needed to represent the data while still capturing most of the original variation.
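To illustrate how the ordered components split up the variance, here is a toy sketch with made-up numbers, using the SVD of the centered data rather than an explicit covariance matrix:

[code]
import numpy as np

rng = np.random.default_rng(1)
# Toy data: three correlated variables, with one direction carrying most of the variance
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.5, 0.1],
                                          [0.0, 1.0, 0.1],
                                          [0.0, 0.0, 0.2]])

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # fraction of total variance per component, in order
print(explained)                  # first entry is close to 0.9: the first component dominates
scores = Xc @ Vt[:2].T            # project the data onto the first two components
[/code]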

3. What is the purpose of PCA?

The main purpose of PCA is to simplify complex data by reducing its dimensionality while still retaining the most important information. This can be useful for data visualization, data compression, and feature extraction in machine learning and data analysis tasks.
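As a small illustration of the compression use case (again only a sketch with arbitrary shapes, not tied to the paper), storing k scores per sample plus the k directions gives a low-rank approximation of the original data:

[code]
import numpy as np

rng = np.random.default_rng(2)
# Data that is approximately rank 3 plus a little noise
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 10)) \
    + 0.1 * rng.normal(size=(100, 10))
mean = X.mean(axis=0)

U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 3
W = Vt[:k].T                              # 10 x 3: the leading principal directions
scores = (X - mean) @ W                   # compressed representation: 100 x 3
X_hat = mean + scores @ W.T               # rank-k reconstruction of the data

# Relative reconstruction error of the centered data; small, since three
# components retain almost all of the structure
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean))
[/code]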

4. What is the difference between PCA and SPCA?

In Roweis' paper, SPCA stands for "sensible" principal component analysis, a probabilistic version of PCA. Both methods reduce dimensionality using the linear-Gaussian model y = Cx + noise, but standard PCA corresponds to the limit of zero observation noise, whereas SPCA keeps an isotropic (spherical) Gaussian noise term R = εI and estimates ε from the data. Because SPCA is a proper probability model, it assigns a likelihood to observations and handles missing values naturally within the EM framework, and it reduces to ordinary PCA as the noise variance goes to zero.
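For reference, here is a sketch of the EM iteration for the zero-noise (ordinary PCA) limit of this model, in the spirit of the paper but paraphrased rather than copied from it; Y is assumed to hold centered observations as columns:

[code]
import numpy as np

def em_pca(Y, k, n_iter=100, seed=0):
    """EM for the zero-noise PCA limit.  Y is p x n with centered observations
    as columns; returns a p x k loading matrix C whose columns span the
    subspace of the top-k principal directions."""
    rng = np.random.default_rng(seed)
    p, n = Y.shape
    C = rng.normal(size=(p, k))                    # random initialization
    for _ in range(n_iter):
        X = np.linalg.solve(C.T @ C, C.T @ Y)      # E-step: X = (C^T C)^{-1} C^T Y
        C = Y @ X.T @ np.linalg.inv(X @ X.T)       # M-step: C = Y X^T (X X^T)^{-1}
    return C
[/code]

The columns of the returned C are not orthonormal; orthonormalizing them (for example with np.linalg.qr) gives a basis of the principal subspace. Keeping a nonzero spherical noise term instead of taking this limit gives the corresponding SPCA updates.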

5. What are some potential drawbacks of using PCA?

One potential drawback of PCA is that the resulting principal components can be hard to interpret, since each component is a linear combination of all the original variables, especially when the dataset has many variables. In addition, PCA captures only linear structure, because it is built from the covariance matrix, so strongly non-Gaussian or nonlinearly related data may be better served by other dimensionality-reduction methods. Finally, the directions of largest variance are not necessarily the most informative ones for a given task, so PCA can discard information that matters for, say, classification.
