PCA and variance on particular axis

  • Context: Undergrad
  • Thread starter: Asuralm
  • Tags: Axis, PCA, Variance
SUMMARY

This discussion focuses on calculating the variance of a dataset projected onto a random direction in a 3D space using principles of Principal Component Analysis (PCA) and quadratic forms. The covariance matrix is essential for determining principal axes, while eigenvalues represent variance along these axes. The conversation highlights the geometric relationship between Euclidean geometry and statistical properties, emphasizing the importance of orthogonal transformations and orthoprojections. Caution is advised regarding the interpretation of PCA results, as they can be misleading despite their mathematical validity.

PREREQUISITES
  • Understanding of Principal Component Analysis (PCA)
  • Familiarity with covariance matrices and eigenvalues
  • Knowledge of quadratic forms in statistics
  • Basic concepts of Euclidean geometry and orthogonal transformations
NEXT STEPS
  • Explore the calculation of covariance matrices in 3D datasets
  • Learn about eigenvalue decomposition and its applications in PCA
  • Study the geometric interpretation of quadratic forms in statistical analysis
  • Investigate the implications of PCA in statistical modeling and its limitations
USEFUL FOR

Data scientists, statisticians, and researchers interested in multivariate analysis and the geometric foundations of statistical methods.

Asuralm
Hi All:

Given a set of 3D data points, it is easy to calculate the covariance matrix and get the principal axes, and each eigenvalue is the variance along the corresponding principal axis. My problem is this: given an arbitrary direction, how do I calculate the variance of the data along that direction?

Can anybody help me with this please?

Thanks
 
Speaking of Applicable Geometry

How much do you know about the relationship between Euclidean geometry and mean/variance?

Your question really concerns the values taken by a quadratic form: if [itex]C[/itex] is the covariance matrix of the data and [itex]\vec{u}[/itex] is a unit vector along the given direction, the variance of the data along that direction is [itex]\vec{u}^T C \, \vec{u}[/itex], which reduces to an eigenvalue when [itex]\vec{u}[/itex] is a principal axis. Many statistical manipulations (and many useful properties of normal distributions) arise in a geometrically natural manner from manipulating quadratic forms by orthogonal transformations and orthoprojections, together with some notions from affine geometry such as convexity. For example, taking the mean of n variables [itex]x_1, \, x_2, \dots, x_n[/itex], where we think of this data as the vector [itex]\vec{x} = x_1 \, \vec{e}_1 + \dots + x_n \, \vec{e}_n[/itex], corresponds to taking the orthoprojection (defined using the standard Euclidean inner product) onto the one-dimensional subspace spanned by [itex]\vec{e}_1 + \vec{e}_2 + \dots + \vec{e}_n[/itex]. If we adopt a new orthonormal basis that includes the unit vector [itex]\vec{f}_n = \frac{1}{\sqrt{n}} \, \left( \vec{e}_1 + \vec{e}_2 + \dots + \vec{e}_n \right)[/itex], this orthoprojection amounts to forgetting all but the last component [itex]\sqrt{n} \, \overline{x}[/itex], which agrees (up to a constant multiple) with the arithmetic mean.
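
To make the quadratic-form remark concrete, here is a minimal sketch (not from the original posts) of one way to get the variance along a given direction: normalize the direction to a unit vector u, form the sample covariance matrix C, and evaluate u^T C u. The function names `covariance_matrix` and `variance_along_direction`, the N - 1 divisor, and the random test data are all assumptions made for illustration. The result is cross-checked against the variance of the explicitly projected points.

```python
import math
import random
import statistics


def covariance_matrix(points):
    """Sample covariance matrix (divisor N - 1) of a list of d-dimensional points."""
    n = len(points)
    d = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    return [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def variance_along_direction(points, direction):
    """Variance of the data projected onto `direction`, computed as u^T C u."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    u = [c / norm for c in direction]  # unit vector along the direction
    cov = covariance_matrix(points)
    d = len(u)
    return sum(u[i] * cov[i][j] * u[j] for i in range(d) for j in range(d))


# Made-up data: 200 points with a different spread along each coordinate axis.
random.seed(0)
points = [(random.gauss(0, 3.0), random.gauss(0, 1.0), random.gauss(0, 0.5))
          for _ in range(200)]
direction = (1.0, 2.0, -1.0)  # arbitrary, need not be a unit vector

quad_form = variance_along_direction(points, direction)

# Cross-check: project every point onto the unit vector, then take the
# ordinary sample variance of the resulting scalars.
norm = math.sqrt(sum(c * c for c in direction))
projections = [sum(p[k] * direction[k] / norm for k in range(3)) for p in points]
direct = statistics.variance(projections)

print(quad_form, direct)  # the two values should agree up to rounding
```

If NumPy is available, the same quantity should be obtainable as `u @ np.cov(points, rowvar=False) @ u` for a unit vector `u`; the pure-Python version is used here only to keep the sketch dependency-free.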

See M. G. Kendall, A Course in the Geometry of n Dimensions, Dover reprint, and then try the same author's book Multivariate Analysis.

I must add a caution: do you see why principal component analysis (PCA) is essentially a method for "lying with statistics"? That is, the geometric (or, if you prefer, linear algebraic) manipulations of your data set are mathematically valid, but the statistical interpretation is almost always extremely dubious. Fortunately, my remark about the role of Euclidean geometry in mathematical statistics holds true for many more legitimate statistical methods, some of which are discussed in the first book by Kendall cited above.
 