Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Rank of sample covariance matrix

  1. Dec 9, 2009 #1
    I was reading Turk and Pentland paper 'Eigenfaces for recognition' and they assert that, if M < N, the maximum rank of a covariance matrix is M - 1, being M the number of samples and NxN the size of the covariance matrix.

    Is there any simple demonstration of this fact?

    Thanks in advance,

  2. jcsd
  3. Dec 14, 2009 #2
    This seems incorrect to me, it should be M.

    The reason is we usually capture the covariance matrix using X*X' where X is an NxM matrix of N features by M observations (usually we normalize X so that each row [feature] has mean 0 and variance 1, or some similar process, so that we are getting the standard covariance estimate).

    Then of course the rank of X*X' can be at most M for M < N because you may view applying the transformation X*X' as applying the transformation X' followed by applying the transformation X. Then because X' is an MxN matrix taking R^N to R^M, so its range (i.e. rank) cannot be greater than the dimension of R^M, namely M. Then, X takes R^M to a subspace of R^N, with dimension no greater than M (this follows from linear algebra, but you may simply think about it like this: X is taking the entire subspace of R^M (the range of X') to somewhere in R^N, since it is a linear function, it cannot fill up more of R^N then what is put into it, namely it cannot output more than the image of X on the subspace of R^M, so again the dimension of the range is no greater than M). Thus X*X' can have rank no greater than M.
    (Note also I use X' to denote transpose of X and R^n, assuming real number system).

    Assuming this is what they mean by covariance matrix, it is easy to come up with a counter-example to the rank = M-1 claim, unless there is more information missing, i.e. some other constraint given on the variables. In fact generate a random NXM matrix X and take the covariance matrix, or equivalently normalize it properly and take X*X' and most of the time the rank will be M.
  4. Dec 14, 2009 #3


    User Avatar
    Science Advisor
    Gold Member

    brian44's argument shows that the rank can be no greater than M but nevertheless it could be less (as he hints), depending on what the data actually is.

    I had a look at Turk and Pentland's paper (Eigenfaces for recognition) and in their case the M columns of X (or A in their notation) are not independent because they have subtracted the mean column from each column (i.e. the sum along each row is zero). In their notation, Φi = Γi - Ψ. This constraint reduces the rank by one.
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook