PCA (principal component analysis) with standardized data

AI Thread Summary
Using standardized data with the correlation matrix in PCA is preferred because it accounts for the relative variance in different units, ensuring that all variables contribute equally to the analysis. Converting all data to the same units, like meters per second, may overlook the inherent relationships and variances present in the original measurements. Normalizing data typically involves calculating z-scores, which allows for a more accurate comparison across diverse datasets. The choice of method can lead to different results, highlighting the importance of defining what "better" means in a mathematical context. Ultimately, the method selected can significantly influence the outcomes of the PCA analysis.
cutesteph
Why is it better to use standardized data with the correlation matrix than, say, converting the data into similar units? For example, suppose I had car race data where some times are measured in seconds and others in minutes. Why would it be better to use the correlation matrix to normalize the data than to just convert all the times to, say, meters traveled per second?
 
cutesteph said:
Why would it be better to use the correlation matrix to normalize the data than to just convert all the times to, say, meters traveled per second?

I don't understand this sentence, but in general data analysis requires all data to have the same units.
 
I mean, say we are looking at car race data where the 1/4 mile and 1 mile times are in seconds, while the times for a 10 mile and a 50 mile race are in minutes. Can't you normalize the data using the correlation matrix within each group, so that the 1/4 mile race (in seconds) is comparable to the 10 mile race (in minutes)? My professor analyzed data that way in a lecture and compared it to a method of just converting all the units to meters per second and taking the covariance matrix of that.
 
cutesteph said:
Why would it be better to use the correlation matrix to normalize the data than to just convert all the times to, say, meters traveled per second?

What is your definition of "normalizing" the data? Does it amount to replacing the data on each axis by the "z-score" of the data?
 
Yes, which would be equivalent to using the correlation matrix in lieu of the covariance matrix for PCA. I'm just not sure exactly why it would be better to use that method than to change everything to the same units, say meters per second for each race length in my example.
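That equivalence is easy to check numerically. Here is a minimal sketch with made-up race-time numbers (the distributions are invented for illustration) showing that the covariance matrix of z-scored data is exactly the correlation matrix of the raw data:

```python
import numpy as np

# Hypothetical race-time data: rows are cars, columns are race lengths.
# The columns are in different units (seconds vs minutes), as in the example above.
rng = np.random.default_rng(0)
quarter_mile_s = rng.normal(12.0, 1.5, size=50)   # seconds
ten_mile_min = rng.normal(25.0, 4.0, size=50)     # minutes
X = np.column_stack([quarter_mile_s, ten_mile_min])

# Standardize each column to z-scores (subtract mean, divide by sample std).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# The covariance matrix of the z-scored data equals the correlation matrix
# of the raw data, so PCA on z-scores is PCA on the correlation matrix.
cov_of_z = np.cov(Z, rowvar=False)
corr_of_x = np.corrcoef(X, rowvar=False)
print(np.allclose(cov_of_z, corr_of_x))  # True
```

Note that standardizing makes every variable have unit variance, so each one contributes on an equal footing regardless of its original units.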
 
cutesteph said:
I'm just not sure exactly why it would be better to use that method than to change everything to the same units, say meters per second for each race length in my example.

We'd have to define what "better" means mathematically to investigate that question.

Perhaps the professor was illustrating that you can get different answers if you convert units and do PCA than if you do PCA and convert the units in the principal components afterwards. That difference doesn't mean that one way is always better or worse than the other.
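To see that the two approaches really do give different answers, here is a small sketch (with made-up numbers) comparing PCA on the covariance matrix of unit-converted speeds against PCA on the correlation matrix of the raw times. Neither result is "the" answer; they weight the variables differently:

```python
import numpy as np

# Hypothetical data: times for the same cars over two race distances.
rng = np.random.default_rng(1)
dist_m = np.array([402.0, 16093.0])  # 1/4 mile and 10 miles, in meters
times_s = np.column_stack([
    rng.normal(12.0, 1.5, size=60),     # 1/4 mile times, seconds
    rng.normal(1500.0, 200.0, size=60)  # 10 mile times, seconds (converted from minutes)
])

# Method 1: convert to common units (average speed, m/s),
# then do PCA on the covariance matrix.
speeds = dist_m / times_s
cov = np.cov(speeds, rowvar=False)
_, vecs_cov = np.linalg.eigh(cov)

# Method 2: z-score each raw column, i.e. PCA on the correlation matrix.
corr = np.corrcoef(times_s, rowvar=False)
_, vecs_corr = np.linalg.eigh(corr)

# Leading principal directions (last column from eigh's ascending order).
print("covariance PCA:", vecs_cov[:, -1])
print("correlation PCA:", vecs_corr[:, -1])
```

With two standardized variables the correlation-matrix eigenvectors are always proportional to (1, 1) and (1, -1), while the covariance-matrix result is pulled toward whichever variable happens to have the larger variance in the chosen units.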
 