PCA and variance on particular axis

Asuralm
Hi All:

Given a set of 3D data points, it's easy to calculate the covariance matrix and get the principal axes, and each eigenvalue is the variance along the corresponding principal axis. My problem is this: given an arbitrary direction, how do I calculate the variance of the data along that direction?

Can anybody help me with this please?

Thanks
 
Speaking of Applicable Geometry

How much do you know about the relationship between euclidean geometry and mean/variance?

Your question really concerns the values taken by a quadratic form. Many statistical manipulations (and many useful properties of normal distributions) arise in a geometrically natural manner from manipulating quadratic forms by orthogonal transformations and orthoprojections, together with some notions from affine geometry such as convexity. For example, taking the mean of n variables ##x_1, \, x_2, \dots, x_n##, where we think of this data as the vector ##\vec{x} = x_1 \, \vec{e}_1 + \dots + x_n \, \vec{e}_n##, corresponds to taking the orthoprojection (defined using the standard euclidean inner product) onto the one-dimensional subspace spanned by ##\vec{e}_1 + \vec{e}_2 + \dots + \vec{e}_n##. If we adopt a new orthonormal basis including the unit vector ##\vec{f}_n = \frac{1}{\sqrt{n}} \, \left( \vec{e}_1 + \vec{e}_2 + \dots + \vec{e}_n \right)##, this orthoprojection amounts to forgetting all but the last component ##\sqrt{n} \, \overline{x}##, which agrees (up to a constant multiple) with the arithmetic mean.
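
To make the quadratic-form remark concrete for the original question: if C is the covariance matrix of the data and u is a unit vector along the chosen direction, the variance of the data projected onto u is the quadratic form u^T C u. Below is a minimal sketch in plain Python, not a definitive implementation. The function names, the sample data, and the choice of the n - 1 (sample) divisor are all assumptions made for illustration; none of them come from the thread.

```python
import math
import random

def covariance_matrix(points):
    """Sample covariance matrix (divides by n - 1) of a list of points."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - mean[k] for k in range(dim)] for p in points]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def normalize(direction):
    """Scale a nonzero vector to unit length."""
    length = math.sqrt(sum(x * x for x in direction))
    return [x / length for x in direction]

def variance_along(cov, direction):
    """Quadratic form u^T C u, where u is the unit vector along `direction`."""
    u = normalize(direction)
    dim = len(u)
    return sum(u[i] * cov[i][j] * u[j] for i in range(dim) for j in range(dim))

def projected_variance(points, direction):
    """Direct check: project each point onto u, then take the sample variance."""
    u = normalize(direction)
    scalars = [sum(pk * uk for pk, uk in zip(p, u)) for p in points]
    m = sum(scalars) / len(scalars)
    return sum((s - m) ** 2 for s in scalars) / (len(scalars) - 1)

# Hypothetical 3D data set, generated only for illustration.
random.seed(0)
points = [(random.gauss(0, 3), random.gauss(0, 1), random.gauss(0, 0.5))
          for _ in range(500)]

cov = covariance_matrix(points)
direction = (1.0, 2.0, -1.0)  # any nonzero direction; it is normalized internally

# The two numbers agree up to floating-point rounding.
print(variance_along(cov, direction))
print(projected_variance(points, direction))
```

When the direction is an eigenvector of C, the quadratic form returns the corresponding eigenvalue, which recovers the principal-axis case described in the question.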

See M. G. Kendall, A Course in the Geometry of n Dimensions, Dover reprint, and then try the same author's book Multivariate Analysis.

I must add a caution: do you see why principal component analysis (PCA) is essentially a method for "lying with statistics"? That is, the geometric (or if you prefer, linear algebraic) manipulations of your data set are mathematically valid, but the statistical interpretation is almost always extremely dubious. Fortunately, my remark about the role of euclidean geometry in mathematical statistics holds true for many more legitimate statistical methods, some discussed in the first book by Kendall cited above.
 