Estimating the variance of eigenvalues of sample covariance matrices in Matlab

Click For Summary

Discussion Overview

The discussion revolves around estimating the variance of eigenvalues of sample covariance matrices using Matlab. Participants explore the theoretical expectations versus the results obtained from simulations, focusing on the statistical properties of eigenvalues derived from finite samples afflicted with random noise. The scope includes theoretical analysis, simulation results, and potential discrepancies in expected outcomes.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant describes their approach to generating sample covariance matrices and calculating eigenvalues, referencing theoretical results for the variance of eigenvalues.
  • Another participant suggests that the theoretical results may depend on assumptions about the number of random variables and samples, indicating a potential mismatch with the simulation setup.
  • Some participants propose verifying the covariance matrix estimation from samples and its implications for unbiased estimators of eigenvalues.
  • There are discussions about the relevance of specific equations from referenced papers and the need for additional sources to clarify theoretical expectations.
  • A later reply questions the differentiability of eigenvalue estimates in relation to the sample matrix, suggesting that the ordering of eigenvalues may complicate their statistical properties.

Areas of Agreement / Disagreement

Participants express differing views on the assumptions underlying the theoretical results and their applicability to the simulation. There is no consensus on the reasons for discrepancies between theoretical predictions and simulation results.

Contextual Notes

Some participants note that the theoretical results may involve assumptions about the relationship between the number of variables and samples that are not met in the simulations. There are also concerns regarding the interpretation of theorems and their applicability to Monte-Carlo simulations.

Who May Find This Useful

This discussion may be useful for researchers and practitioners interested in statistical properties of eigenvalues in sample covariance matrices, particularly in the context of simulations and theoretical analysis in signal processing and related fields.

weetabixharry
Messages
111
Reaction score
0
I am trying to investigate the statistical variance of the eigenvalues of sample covariance matrices using Matlab. To clarify, each sample covariance matrix, \hat{\mathbb{R}}_{nn}, is constructed from a finite number, N, of vector snapshots, each sized (L_{vec} \times 1) (afflicted with random zero-mean white Gaussian noise). Then, over a large number of trials, a large number of such matrices are generated and eigendecomposed in order to estimate the theoretical statistics of the sample eigenvalues, \hat{\lambda}_1,\hat{\lambda}_2,\dots, \hat{\lambda}_{L_{vec}}.

According to several sources (see, for example, [1, Eq.3] and [2, Eq.11]), the covariances of the sample eigenvalues should be:

E \{ (\hat{\lambda}_i - \lambda_i)(\hat{\lambda}_j - \lambda_j)\} \approx \delta_{ij} \frac{\lambda_i^2}{N}

where \delta_{ij} is the Kronecker delta and \lambda_i are the exact eigenvalues of the theoretical covariance matrix, \mathbb{R}_{nn}.

However, the results I get from Matlab aren't even close. Is this an issue with my code? With Matlab/rounding errors etc.? (I've never had such trouble working on similar problems).

Here's a very simple Matlab code example:

Code:
% Data vector length
Lvec = 5;
% Number of snapshots per sample covariance matrix
N = 200;
% Number of simulation trials
Ntrials = 10000;
% Noise variance
sigma2 = 10;

% Theoretical covariance matrix
Rnn_th = sigma2*eye(Lvec);
% Theoretical eigenvalues (should all be sigma2)
lambda_th = sort(eig(Rnn_th),'descend');

lambda = zeros(Lvec,Ntrials);
for trial = 1:Ntrials
    % Generate new (complex) white Gaussian noise data
    n = sqrt(sigma2/2)*(randn(Lvec,N) + 1j*randn(Lvec,N));
    % Sample covariance matrix
    Rnn = n*n'/N;
    % Save sample eigenvalues
    lambda(:,trial) = sort(eig(Rnn),'descend');   
end

% Estimated eigenvalue covariance matrix
b = lambda - lambda_th(:,ones(1,Ntrials));
Rbb = b*b'/Ntrials
% Predicted (approximate) theoretical result
Rbb_th_approx = diag(lambda_th.^2/N)

References:
[1] Friedlander, B.; Weiss, A.J.; , "On the second-order statistics of the eigenvectors of sample covariance matrices," Signal Processing, IEEE Transactions on , vol.46, no.11, pp.3136-3139, Nov 1998
[2] Kaveh, M.; Barabell, A.; , "The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise," Acoustics, Speech and Signal Processing, IEEE Transactions on , vol.34, no.2, pp. 331- 341, Apr 1986
 
Physics news on Phys.org
weetabixharry said:
E \{ (\hat{\lambda}_i - \lambda_i)(\hat{\lambda}_j - \lambda_j)\} \approx \delta_{ij} \frac{\lambda_i^2}{N}

where \delta_{ij} is the Kronecker delta and \lambda_i are the exact eigenvalues of the theoretical covariance matrix, \mathbb{R}_{nn}.

I suggest you try to find that result in an online source and link to it. I don't know how you can evaluate your results without know the variance of the sample eigenvalues.

Also, your simulation (as I read it) assumes a sample covariance matrix for a vector of N variables is generated from a sample of N random vectors. The theoretical result may have some assumption about keeping the number of random variables N constant and computing the covariance matrix from M samples where M is larger than N or where M approaches infinity.
 
Stephen Tashi said:
I suggest you try to find that result in an online source and link to it.
Please see Equation 11 (and the sentences directly before that) here:

http://www.scribd.com/doc/112046406/Kaveh-86

Stephen Tashi said:
I don't know how you can evaluate your results without know the variance of the sample eigenvalues.
I estimate the variance using a large number of simulation trials. I expect this to be similar to the theoretically predicted result, but it is not.

Stephen Tashi said:
Also, your simulation (as I read it) assumes a sample covariance matrix for a vector of N variables is generated from a sample of N random vectors.
It is a vector of Lvec variables. The number of snapshots is N.
 
Can you estimate your covariance matrix from your sample and then do the analysis on that matrix to get an estimate?

You may want to see if this technique produces an estimate that is unbiased (or close enough) if you need to verify this.

It would involve you showing that if you have a covariance matrix for your distribution C with eigen-values lamda, then obtaining a covariance matrix C' (estimating from your sample) produces an unbiased estimator for lambda_hat (and possibly for the eigen-vectors as well).

Another comment is also to see if you can use an MLE estimator with the invariance principle because if the lambda's can be found by a transformation of said covariance matrix (which they can through the characteristic function) then it means that the lambda's will be an estimate of the covariance matrix is given under MLE estimator construction to estimate the original covariance matrix.
 
chiro said:
Can you estimate your covariance matrix from your sample and then do the analysis on that matrix to get an estimate?
If I save all the sample covariance matrices for all trials, then compute the mean across all trials, this is approximately equal to the theoretical covariance matrix. (Is this what you mean?).

chiro said:
It would involve you showing that if you have a covariance matrix for your distribution C with eigen-values lamda, then obtaining a covariance matrix C' (estimating from your sample) produces an unbiased estimator for lambda_hat (and possibly for the eigen-vectors as well).
The means of the sample eigenvalues are equal (approximately) to the theoretical values.

chiro said:
... it means that the lambda's will be an estimate of the covariance matrix is given under MLE estimator construction to estimate the original covariance matrix.
I didn't quite follow this. I'm not too familiar with MLE.
 
weetabixharry,

I tried various modifications of your code and I could not get results that agree with equation 11 of the paper you linked. I confirm that the mean of the covariance matrices is approximately the theoretical covariance matrix.

That paper cites D. R. Brillinger, Time Series: Data Analysis and Theory, Holt, Reinhart and Windston, 1975. We need to look at what that book says - or find another reference that derives equation 11.

Does any forum member have access to that book?
 
Stephen Tashi said:
weetabixharry,
That paper cites D. R. Brillinger, Time Series: Data Analysis and Theory, Holt, Reinhart and Windston, 1975. We need to look at what that book says - or find another reference that derives equation 11.

Does any forum member have access to that book?

I have access to the book, but the relevant theorem regarding the second order statistics of the sample eigenvalues and eigenvectors (Theorem 9.2.4, p. 343) only provides the following proof:

"Theorem 9.2.4 results from two facts: the indicated latent roots and vectors are differentiable functions of the entries of \hat{\mathbb{R}}_{nn} and \hat{\mathbb{R}}_{nn} is asymptotically normal as N \rightarrow \infty."

I am confident that these results should be relevant for my purposes, as the main result of [2] is one I am very familiar with and have confirmed extensively using Matlab simulations. It just seems strange that I can't match to this intermediate step.
 
weetabixharry said:
"Theorem 9.2.4 results from two facts: the indicated latent roots and vectors are differentiable functions of the entries of \hat{\mathbb{R}}_{nn} and \hat{\mathbb{R}}_{nn} is asymptotically normal as N \rightarrow \infty."

What a theorem means isn't clear unless the theorem is stated, including defining all the quantities involved and all assumptions it makes about them. I'm interested in whether the result about the covariance may make some assumptions that are not true in the Monte-Carlo simulation.

My impression of what you gave is that I don't see that the estimate of the kth eigenvalue is a differentiable function of a sample matrix since computing them from the sample involves putting the estimated eigenvalues in a definite order. This isn't as simple as using the theory of order statistics to sample each component of a vector of values independently from the same distribution.
 
Last edited:
Stephen Tashi said:
I don't see that the estimate of the kth eigenvalue is a differentiable function of a sample matrix since computing them from the sample involves putting the estimated eigenvalues in a definite order.
Yes, the one thing I was suspicious of was the ordering. However, I'm not sure how I could fix this. Perhaps if I gave each element of the random vector significantly different powers, then ordering them wouldn't mess things up? Or is there something else I can do in general?

Stephen Tashi said:
What a theorem means isn't clear unless the theorem is stated, including defining all the quantities involved and all assumptions it makes about them.
Here's the statement of the theorem (where it seems their r is my Lvec and their n is my N):
"Theorem 9.2.4 Suppose the values X_1,...,X_n are a sample from \mathcal{N}_r^C(0,\mathbb{R}_{xx}). Suppose the roots of \mathbb{R}_{xx} are distinct. Then the variate \{\hat{\lambda}_j,\hat{{V}_j};j=1,\dots,r\} is asymptotically normal with with \{\hat{\lambda}_j;j=1,\dots,r\} asymptotically independent of \{\hat{{V}_j};j=1,\dots,r\}. The asymptotic moments are given by:"


(It then lists out some of the results repeated in [1], including the sample eigenvalue variance.)
 
  • #10
weetabixharry said:
Perhaps if I gave each element of the random vector significantly different powers, then ordering them wouldn't mess things up?

I tried this out and it seems to work... suggesting the problem may be due to the ordering. In general, therefore, how can I fix the ordering? Perhaps try to match up the associated eigenvectors as best as possible?
 
  • #11
weetabixharry said:
Here's the statement of the theorem (where it seems their r is my Lvec and their n is my N):
"Theorem 9.2.4 Suppose the values X_1,...,X_n are a sample from \mathcal{N}_r^C(0,\mathbb{R}_{xx}). Suppose the roots of \mathbb{R}_{xx} are distinct. Then the variate \{\hat{\lambda}_j,\hat{{V}_j};j=1,\dots,r\} is asymptotically normal with with \{\hat{\lambda}_j;j=1,\dots,r\} asymptotically independent of \{\hat{{V}_j};j=1,\dots,r\}. The asymptotic moments are given by:"

(It then lists out some of the results repeated in [1], including the sample eigenvalue variance.)

It would help to know what \mathcal{N}_r^C(0,\mathbb{R}_{xx}) is. Does the fact the roots of \mathbb{R}_{xx} are distinct have to do with some eigenvalues being distinct?
 
  • #12
Stephen Tashi said:
It would help to know what \mathcal{N}_r^C(0,\mathbb{R}_{xx}) is.
As far as I can tell, this means we're concerned with an (r x 1) complex random vector which has a (complex) normal distribution, with mean(s) zero and covariance matrix \mathbb{R}_{xx}.

Stephen Tashi said:
Does the fact the roots of \mathbb{R}_{xx} are distinct have to do with some eigenvalues being distinct?
I find this book very difficult to follow, but it appears to me that "roots" means eigenvalues(?) (I'm not really familiar with what the root of a matrix is... as we're clearly not talking about matrix square roots here).

I am sure, however, that it is not necessary for all eigenvalues to have different values, since [2] deals with scenarios where (r - 2) eigenvalues are identical.
 

Similar threads

  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 1 ·
Replies
1
Views
5K
  • · Replies 1 ·
Replies
1
Views
5K
  • · Replies 5 ·
Replies
5
Views
2K
Replies
1
Views
4K
  • · Replies 3 ·
Replies
3
Views
25K
  • · Replies 9 ·
Replies
9
Views
3K
  • · Replies 1 ·
Replies
1
Views
14K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 6 ·
Replies
6
Views
4K