# Applying PCA to two correlated stochastic processes

• Frank Einstein
In summary, Matlab's pca function works on a single wind-speed matrix U, but it fails when applied to UV = cat(3,U,V), the 3-D array that stacks the X and Y components.

#### Frank Einstein

Hello everyone. I have two matrices of size 9×51, meaning 51 measurements of a stochastic process sampled at 9 times. To be precise, the first matrix holds wind speed in the X direction and the second holds the same data for the Y direction. I am aware that the two stochastic processes are not independent, so I would like to apply PCA to both of them at the same time. My software of choice is Matlab.

I can perform the PCA on a single matrix simply as:
[coeffU, scoreU, latentU, tsquaredU, explainedU, muU] = pca(U,'Centered',false)
However, if I try to run it on UV, where UV = cat(3,U,V), it doesn't work.

Can anyone tell me if there is a way of finding the joint main directions of variation, instead of having to compute them separately for each matrix?

#### JMz

I am very familiar with MATLAB and with PCA, but not with MATLAB's PCA. (I use the SVD functions.) Which toolbox is PCA in?

I ask because I'm not sure whether you have a small syntactic problem with the PCA function or a deeper problem in the formulation of the question (one that the difficulties with the PCA function are exposing). For instance, MATLAB's matrix factorization functions often refuse to work with 3-D arrays, so CAT(3,...) would always fail.

#### Frank Einstein
> JMz said:
> I am very familiar with MATLAB and with PCA, but not with MATLAB's PCA. (I use the SVD functions.) Which toolbox is PCA in?
>
> I ask because I'm not sure whether you have a small syntactic problem with the PCA function or a deeper problem in the formulation of the question (one that the difficulties with the PCA function are exposing). For instance, MATLAB's matrix factorization functions often refuse to work with 3-D arrays, so CAT(3,...) would always fail.

Second, this has all the info regarding Matlab PCA that I am aware of: https://fr.mathworks.com/help/stats/pca.html

And third, I am posting this because I know that Mathematica can do this (see attached images). I have tried to compute the same thing in Matlab since I don't want to switch between programs.

https://ibb.co/fUGAjo
This is the attempt in Matlab

https://ibb.co/eYBPc8
And this is in Mathematica.

You are right that cat(3,...) doesn't work, so that's my problem. Are you aware of any way to circumvent it and do what I have done in Mathematica?

Thanks again

## 1. What is PCA and how does it work?

PCA stands for Principal Component Analysis and it is a statistical method used to reduce the dimensionality of a dataset while retaining most of its variability. It works by finding the directions of maximum variance in a dataset and projecting the data onto those directions, resulting in a new set of uncorrelated variables called principal components.
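Written out, the description above corresponds to the following standard formulation. The symbols are notation chosen here for illustration, not taken from the thread: $x_i \in \mathbb{R}^p$ is the $i$-th of $n$ observations, $w_j$ is the $j$-th principal direction and $z_{ij}$ is the score of observation $i$ on component $j$.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
C = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}, \qquad
C\,w_j = \lambda_j w_j, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0
$$

$$
z_{ij} = w_j^{\top}(x_i - \bar{x}), \qquad
\operatorname{Var}(z_{\cdot j}) = \lambda_j, \qquad
\operatorname{Cov}(z_{\cdot j}, z_{\cdot k}) = 0 \quad (j \neq k)
$$

The eigenvectors $w_j$ are orthonormal, so the scores are just the centered data expressed in a rotated coordinate system whose axes are ordered by variance.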

## 2. How can PCA be applied to two correlated stochastic processes?

PCA can be applied to two correlated stochastic processes jointly by stacking them into a single two-dimensional data matrix: each realization becomes one observation, and the samples of both processes become its variables. The covariance matrix of this stacked data contains the covariance within each process as well as the cross-covariance between the two. The eigenvectors of that matrix are the joint principal directions, and the eigenvalues are the variances along them. Projecting the centered data onto the eigenvectors gives the principal component scores, a new set of uncorrelated variables.
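For the wind data in this thread, one possible way to do this in Matlab is to concatenate the two matrices in two dimensions rather than along a third one. The following is a minimal sketch under stated assumptions, not a definitive recipe: U and V are replaced by synthetic 9×51 stand-ins, each of the 51 columns is treated as one realization, and the PCA is computed with base Matlab's svd so that no toolbox is needed. The names X, W, coeffU and coeffV are illustrative.

```matlab
% Synthetic stand-ins for the wind data (assumption: 9 times x 51 realizations).
rng(0);
U = randn(9, 51);
V = 0.8*U + 0.2*randn(9, 51);   % V is correlated with U by construction

% Stack the two components so that each realization becomes one row of
% 18 variables: columns 1-9 are U at the 9 times, columns 10-18 are V.
X = [U; V]';                    % 51 x 18, an ordinary 2-D matrix

% PCA via the SVD of the centered data.
mu = mean(X, 1);
Xc = X - mu;                    % implicit expansion (R2016b or later)
[~, S, W] = svd(Xc, 'econ');    % columns of W are the joint principal directions
latent = diag(S).^2 / (size(X, 1) - 1);   % variance along each direction
score = Xc * W;                 % coordinates of each realization

% Split each joint direction back into its U part and its V part.
coeffU = W(1:9, :);
coeffV = W(10:18, :);
```

Since the pca function in the Statistics and Machine Learning Toolbox treats rows as observations and columns as variables, calling pca(X) on the same 51×18 matrix should give the same directions up to sign. Note that the original post passes 'Centered',false, whereas this sketch centers the data, which is the more common convention.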

## 3. What are the advantages of using PCA for two correlated stochastic processes?

The main advantage of using PCA for two correlated stochastic processes is dimensionality reduction. By reducing the number of variables, it becomes easier to interpret the data and make predictions. Additionally, PCA can also help identify the most important features or patterns in the data, which can be useful for further analysis.
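As a rough illustration of dimensionality reduction, the sketch below keeps only the first k joint components and measures how much of the data is lost. Everything in it is assumed for the example: the synthetic data (both components driven by one shared random amplitude), the choice k = 2, and the variable names.

```matlab
% Synthetic data: both wind components share one random amplitude per realization.
rng(1);
t = linspace(0, 1, 9)';                  % 9 sample times (hypothetical)
a = randn(1, 51);                        % one amplitude per realization
U = t * a + 0.05*randn(9, 51);           % X component
V = (1 - t) * a + 0.05*randn(9, 51);     % Y component
X = [U; V]';                             % 51 realizations x 18 variables

mu = mean(X, 1);
Xc = X - mu;
[~, S, W] = svd(Xc, 'econ');

k = 2;                                   % number of components kept (arbitrary)
Z = Xc * W(:, 1:k);                      % 51 x k reduced representation
Xhat = Z * W(:, 1:k)' + mu;              % reconstruction in the original 18 variables

% Relative reconstruction error in the Frobenius norm.
relErr = norm(X - Xhat, 'fro') / norm(Xc, 'fro');
```

The squared relative error equals the fraction of total variance carried by the discarded components, so a small relErr means the k retained scores summarize both processes well.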

## 4. Are there any limitations to using PCA for two correlated stochastic processes?

One limitation of using PCA for two correlated stochastic processes is that it captures only linear relationships between the variables. If the dependence between the processes is non-linear, PCA may not be the most appropriate method to use. PCA is also sensitive to outliers, because a few extreme values can dominate the variance, so it may not be suitable for datasets with extreme values.

## 5. How can the results of PCA be interpreted for two correlated stochastic processes?

The results of PCA can be interpreted by looking at the eigenvalues and eigenvectors of the covariance matrix. Each eigenvalue is the amount of variance explained by one principal component, and the matching eigenvector is that component's direction; the eigenvector with the largest eigenvalue is the direction of maximum variance. The first few principal components, those with the largest eigenvalues, can be considered the most significant and can be used for further analysis or modeling.
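The eigenvalue route can be sketched directly with base Matlab's cov and eig. This is an illustrative example on synthetic data, and the 90% threshold and all variable names are arbitrary choices made here.

```matlab
% Synthetic stand-ins: 9 times x 51 realizations for each wind component.
rng(2);
U = randn(9, 51);
V = 0.5*U + randn(9, 51);
X = [U; V]';                             % 51 realizations x 18 variables

% Joint covariance: the off-diagonal 9 x 9 blocks hold the U-V cross-covariance.
C = cov(X);
[W, D] = eig((C + C')/2);                % symmetrize to guard against round-off
[latent, order] = sort(diag(D), 'descend');
W = W(:, order);                         % eigenvectors sorted to match latent

explained = 100 * latent / sum(latent);  % percent of total variance per component
cumExplained = cumsum(explained);
nKeep = find(cumExplained >= 90, 1);     % fewest components reaching 90 percent
```

Here latent plays the role of the eigenvalues, explained gives each component's share of the total variance, and rows 1 to 9 and 10 to 18 of a column of W show how that component loads on the X and Y wind components respectively.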