Rank of sample covariance matrix

SUMMARY

The discussion centers on the rank of the sample covariance matrix as presented in Turk and Pentland's paper 'Eigenfaces for recognition'. It is established that when the number of samples (M) is less than the number of features (N), the maximum rank of the covariance matrix, calculated as X*X', is M, not M - 1. The rank can be reduced by constraints such as mean normalization, which is evident in the authors' methodology where the mean column is subtracted from each column, resulting in a rank reduction. A counter-example demonstrates that the rank can indeed be M under typical conditions.

PREREQUISITES
  • Understanding of covariance matrices and their properties
  • Familiarity with linear algebra concepts, particularly matrix rank
  • Knowledge of matrix operations, specifically matrix multiplication and transposition
  • Experience with data normalization techniques in statistical analysis
NEXT STEPS
  • Explore the implications of covariance matrix rank in Principal Component Analysis (PCA)
  • Learn about the effects of data normalization on covariance matrices
  • Investigate the relationship between sample size and rank in statistical modeling
  • Study the mathematical foundations of eigenvalues and eigenvectors in relation to covariance matrices
USEFUL FOR

Data scientists, statisticians, and researchers in machine learning who are working with covariance matrices and dimensionality reduction techniques.

fmilano
I was reading Turk and Pentland's paper 'Eigenfaces for recognition', and they assert that if M < N, the maximum rank of a covariance matrix is M - 1, where M is the number of samples and N×N is the size of the covariance matrix.

Is there any simple demonstration of this fact?

Thanks in advance,

Federico
 
This seems incorrect to me; it should be M.

The reason is that we usually estimate the covariance matrix as X*X', where X is an N×M matrix of N features by M observations (usually X is normalized so that each row [feature] has mean 0 and variance 1, or some similar process, so that X*X' gives the standard covariance estimate).

Then the rank of X*X' can be at most M for M < N, because applying X*X' amounts to applying the transformation X' followed by the transformation X. Since X' is an M×N matrix taking R^N to R^M, its range (and hence its rank) cannot exceed the dimension of R^M, namely M. Then X takes R^M to a subspace of R^N of dimension at most M: X is linear, so it cannot fill up more of R^N than what is put into it; its image of the range of X' (a subspace of R^M) can have dimension at most M. Thus X*X' has rank at most M.

(Note that I use X' to denote the transpose of X, and R^n denotes real n-dimensional space.)

Assuming this is what they mean by covariance matrix, it is easy to come up with a counter-example to the rank = M - 1 claim, unless there is more information missing, i.e. some other constraint on the variables. In fact, generate a random N×M matrix X, normalize it appropriately, and take X*X': most of the time the rank will be M.
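A minimal NumPy sketch of such a counter-example (the sizes N, M and the random seed are arbitrary choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 100, 10  # N features, M observations, with M < N

X = rng.standard_normal((N, M))  # random N x M data matrix
C = X @ X.T                      # N x N matrix X*X'

rank = np.linalg.matrix_rank(C)
print(rank)  # for generic random data this is exactly M, not M - 1
```

With continuous random entries the columns of X are linearly independent with probability 1, so the rank almost surely attains the bound M.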
 
brian44's argument shows that the rank can be no greater than M but nevertheless it could be less (as he hints), depending on what the data actually is.

I had a look at Turk and Pentland's paper (Eigenfaces for recognition), and in their case the M columns of X (or A in their notation) are not independent, because they have subtracted the mean column from each column (so the sum along each row is zero). In their notation, Φ_i = Γ_i - Ψ. This constraint reduces the rank by one.
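This rank reduction is easy to check numerically. A sketch of the mean-subtraction step (variable names follow the paper's notation; the sizes and random data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 100, 10  # N features, M samples, with M < N

Gamma = rng.standard_normal((N, M))      # columns Gamma_i: raw samples
Psi = Gamma.mean(axis=1, keepdims=True)  # mean column Psi
Phi = Gamma - Psi                        # Phi_i = Gamma_i - Psi

C = Phi @ Phi.T  # the paper's covariance-style matrix A*A'

rank = np.linalg.matrix_rank(C)
print(rank)  # M - 1: the columns of Phi sum to zero, losing one dimension
```

Because the M columns of Phi satisfy one linear constraint (they sum to the zero vector), the rank drops from M to at most M - 1, and for generic data it equals M - 1.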
 
