Estimating the variance of eigenvalues of sample covariance matrices in Matlab

weetabixharry · Nov 3, 2012

I am trying to investigate the statistical variance of the eigenvalues of sample covariance matrices using Matlab. To clarify, each sample covariance matrix, [itex]\hat{\mathbb{R}}_{nn}[/itex], is constructed from a finite number, [itex]N[/itex], of vector snapshots, each sized [itex](L_{vec} \times 1)[/itex] (afflicted with random zero-mean white Gaussian noise). Then, over a large number of trials, a large number of such matrices are generated and eigendecomposed in order to estimate the theoretical statistics of the sample eigenvalues, [itex]\hat{\lambda}_1,\hat{\lambda}_2,\dots, \hat{\lambda}_{L_{vec}}[/itex].

According to several sources (see, for example, [1, Eq.3] and [2, Eq.11]), the covariances of the sample eigenvalues should be:

[itex]E \{ (\hat{\lambda}_i - \lambda_i)(\hat{\lambda}_j - \lambda_j)\} \approx \delta_{ij} \frac{\lambda_i^2}{N}[/itex]

where [itex]\delta_{ij}[/itex] is the Kronecker delta and [itex]\lambda_i[/itex] are the exact eigenvalues of the theoretical covariance matrix, [itex]\mathbb{R}_{nn}[/itex].

However, the results I get from Matlab aren't even close. Is this an issue with my code? With Matlab/rounding errors etc.? (I've never had such trouble working on similar problems).

Here's a very simple Matlab code example:

Code:

% Data vector length
Lvec = 5;
% Number of snapshots per sample covariance matrix
N = 200;
% Number of simulation trials
Ntrials = 10000;
% Noise variance
sigma2 = 10;

% Theoretical covariance matrix
Rnn_th = sigma2*eye(Lvec);
% Theoretical eigenvalues (should all be sigma2)
lambda_th = sort(eig(Rnn_th),'descend');

lambda = zeros(Lvec,Ntrials);
for trial = 1:Ntrials
    % Generate new (complex) white Gaussian noise data
    n = sqrt(sigma2/2)*(randn(Lvec,N) + 1j*randn(Lvec,N));
    % Sample covariance matrix
    Rnn = n*n'/N;
    % Save sample eigenvalues
    lambda(:,trial) = sort(eig(Rnn),'descend');   
end

% Estimated eigenvalue covariance matrix
b = lambda - lambda_th(:,ones(1,Ntrials));
Rbb = b*b'/Ntrials
% Predicted (approximate) theoretical result
Rbb_th_approx = diag(lambda_th.^2/N)

References:
[1] Friedlander, B.; Weiss, A.J.; , "On the second-order statistics of the eigenvectors of sample covariance matrices," Signal Processing, IEEE Transactions on , vol.46, no.11, pp.3136-3139, Nov 1998
[2] Kaveh, M.; Barabell, A.; , "The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise," Acoustics, Speech and Signal Processing, IEEE Transactions on , vol.34, no.2, pp. 331- 341, Apr 1986

Stephen Tashi · Nov 3, 2012

weetabixharry said:

[tex]E \{ (\hat{\lambda}_i - \lambda_i)(\hat{\lambda}_j - \lambda_j)\} \approx \delta_{ij} \frac{\lambda_i^2}{N}[/tex]

where [itex]\delta_{ij}[/itex] is the Kronecker delta and [itex]\lambda_i[/itex] are the exact eigenvalues of the theoretical covariance matrix, [itex]\mathbb{R}_{nn}[/itex].

I suggest you try to find that result in an online source and link to it. I don't know how you can evaluate your results without know the variance of the sample eigenvalues.

Also, your simulation (as I read it) assumes a sample covariance matrix for a vector of N variables is generated from a sample of N random vectors. The theoretical result may have some assumption about keeping the number of random variables N constant and computing the covariance matrix from M samples where M is larger than N or where M approaches infinity.

weetabixharry · Nov 3, 2012

Stephen Tashi said:

I suggest you try to find that result in an online source and link to it.

Please see Equation 11 (and the sentences directly before that) here:

http://www.scribd.com/doc/112046406/Kaveh-86

Stephen Tashi said:

I don't know how you can evaluate your results without know the variance of the sample eigenvalues.

I estimate the variance using a large number of simulation trials. I expect this to be similar to the theoretically predicted result, but it is not.

Stephen Tashi said:

Also, your simulation (as I read it) assumes a sample covariance matrix for a vector of N variables is generated from a sample of N random vectors.

It is a vector of Lvec variables. The number of snapshots is N.

chiro · Nov 3, 2012

Can you estimate your covariance matrix from your sample and then do the analysis on that matrix to get an estimate?

You may want to see if this technique produces an estimate that is unbiased (or close enough) if you need to verify this.

It would involve you showing that if you have a covariance matrix for your distribution C with eigen-values lamda, then obtaining a covariance matrix C' (estimating from your sample) produces an unbiased estimator for lambda_hat (and possibly for the eigen-vectors as well).

Another comment is also to see if you can use an MLE estimator with the invariance principle because if the lambda's can be found by a transformation of said covariance matrix (which they can through the characteristic function) then it means that the lambda's will be an estimate of the covariance matrix is given under MLE estimator construction to estimate the original covariance matrix.

weetabixharry · Nov 4, 2012

chiro said:

Can you estimate your covariance matrix from your sample and then do the analysis on that matrix to get an estimate?

If I save all the sample covariance matrices for all trials, then compute the mean across all trials, this is approximately equal to the theoretical covariance matrix. (Is this what you mean?).

chiro said:

It would involve you showing that if you have a covariance matrix for your distribution C with eigen-values lamda, then obtaining a covariance matrix C' (estimating from your sample) produces an unbiased estimator for lambda_hat (and possibly for the eigen-vectors as well).

The means of the sample eigenvalues are equal (approximately) to the theoretical values.

chiro said:

... it means that the lambda's will be an estimate of the covariance matrix is given under MLE estimator construction to estimate the original covariance matrix.

I didn't quite follow this. I'm not too familiar with MLE.

Stephen Tashi · Nov 5, 2012

weetabixharry,

I tried various modifications of your code and I could not get results that agree with equation 11 of the paper you linked. I confirm that the mean of the covariance matrices is approximately the theoretical covariance matrix.

That paper cites D. R. Brillinger, Time Series: Data Analysis and Theory, Holt, Reinhart and Windston, 1975. We need to look at what that book says - or find another reference that derives equation 11.

Does any forum member have access to that book?

weetabixharry · Nov 5, 2012

Stephen Tashi said:

weetabixharry,
That paper cites D. R. Brillinger, Time Series: Data Analysis and Theory, Holt, Reinhart and Windston, 1975. We need to look at what that book says - or find another reference that derives equation 11.

Does any forum member have access to that book?

I have access to the book, but the relevant theorem regarding the second order statistics of the sample eigenvalues and eigenvectors (Theorem 9.2.4, p. 343) only provides the following proof:

"Theorem 9.2.4 results from two facts: the indicated latent roots and vectors are differentiable functions of the entries of [itex]\hat{\mathbb{R}}_{nn}[/itex] and [itex]\hat{\mathbb{R}}_{nn}[/itex] is asymptotically normal as [itex]N \rightarrow \infty[/itex]."

I am confident that these results should be relevant for my purposes, as the main result of [2] is one I am very familiar with and have confirmed extensively using Matlab simulations. It just seems strange that I can't match to this intermediate step.

Stephen Tashi · Nov 5, 2012

weetabixharry said:

"Theorem 9.2.4 results from two facts: the indicated latent roots and vectors are differentiable functions of the entries of [itex]\hat{\mathbb{R}}_{nn}[/itex] and [itex]\hat{\mathbb{R}}_{nn}[/itex] is asymptotically normal as [itex]N \rightarrow \infty[/itex]."

What a theorem means isn't clear unless the theorem is stated, including defining all the quantities involved and all assumptions it makes about them. I'm interested in whether the result about the covariance may make some assumptions that are not true in the Monte-Carlo simulation.

My impression of what you gave is that I don't see that the estimate of the kth eigenvalue is a differentiable function of a sample matrix since computing them from the sample involves putting the estimated eigenvalues in a definite order. This isn't as simple as using the theory of order statistics to sample each component of a vector of values independently from the same distribution.

weetabixharry · Nov 5, 2012

Stephen Tashi said:

I don't see that the estimate of the kth eigenvalue is a differentiable function of a sample matrix since computing them from the sample involves putting the estimated eigenvalues in a definite order.

Yes, the one thing I was suspicious of was the ordering. However, I'm not sure how I could fix this. Perhaps if I gave each element of the random vector significantly different powers, then ordering them wouldn't mess things up? Or is there something else I can do in general?

Stephen Tashi said:

What a theorem means isn't clear unless the theorem is stated, including defining all the quantities involved and all assumptions it makes about them.

Here's the statement of the theorem (where it seems their r is my Lvec and their n is my N):
"Theorem 9.2.4 Suppose the values [itex]X_1,...,X_n[/itex] are a sample from [itex]\mathcal{N}_r^C(0,\mathbb{R}_{xx})[/itex]. Suppose the roots of [itex]\mathbb{R}_{xx}[/itex] are distinct. Then the variate [itex]\{\hat{\lambda}_j,\hat{{V}_j};j=1,\dots,r\}[/itex] is asymptotically normal with with [itex]\{\hat{\lambda}_j;j=1,\dots,r\}[/itex] asymptotically independent of [itex]\{\hat{{V}_j};j=1,\dots,r\}[/itex]. The asymptotic moments are given by:"

(It then lists out some of the results repeated in [1], including the sample eigenvalue variance.)

weetabixharry · Nov 5, 2012

weetabixharry said:

Perhaps if I gave each element of the random vector significantly different powers, then ordering them wouldn't mess things up?

I tried this out and it seems to work... suggesting the problem may be due to the ordering. In general, therefore, how can I fix the ordering? Perhaps try to match up the associated eigenvectors as best as possible?

Stephen Tashi · Nov 5, 2012

weetabixharry said:

Here's the statement of the theorem (where it seems their r is my Lvec and their n is my N):
"Theorem 9.2.4 Suppose the values [itex]X_1,...,X_n[/itex] are a sample from [itex]\mathcal{N}_r^C(0,\mathbb{R}_{xx})[/itex]. Suppose the roots of [itex]\mathbb{R}_{xx}[/itex] are distinct. Then the variate [itex]\{\hat{\lambda}_j,\hat{{V}_j};j=1,\dots,r\}[/itex] is asymptotically normal with with [itex]\{\hat{\lambda}_j;j=1,\dots,r\}[/itex] asymptotically independent of [itex]\{\hat{{V}_j};j=1,\dots,r\}[/itex]. The asymptotic moments are given by:"

(It then lists out some of the results repeated in [1], including the sample eigenvalue variance.)

It would help to know what [itex]\mathcal{N}_r^C(0,\mathbb{R}_{xx})[/itex] is. Does the fact the roots of [itex]\mathbb{R}_{xx}[/itex] are distinct have to do with some eigenvalues being distinct?

weetabixharry · Nov 5, 2012

Stephen Tashi said:

It would help to know what [itex]\mathcal{N}_r^C(0,\mathbb{R}_{xx})[/itex] is.

As far as I can tell, this means we're concerned with an (r x 1) complex random vector which has a (complex) normal distribution, with mean(s) zero and covariance matrix [itex]\mathbb{R}_{xx}[/itex].

Stephen Tashi said:

Does the fact the roots of [itex]\mathbb{R}_{xx}[/itex] are distinct have to do with some eigenvalues being distinct?

I find this book very difficult to follow, but it appears to me that "roots" means eigenvalues(?) (I'm not really familiar with what the root of a matrix is... as we're clearly not talking about matrix square roots here).

I am sure, however, that it is not necessary for all eigenvalues to have different values, since [2] deals with scenarios where (r - 2) eigenvalues are identical.

Estimating the variance of eigenvalues of sample covariance matrices in Matlab

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Who May Find This Useful

Similar threads

MATLAB MATLAB GPU Cross-Correlation Code Fails with OOM on 8 GB Supercomputer

Cone‑like layered structure in 2D FFT

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect