Covariance matrix size: 3x3 or 4x4?

  • #1
fab13
Hello,
I am following the thread https://www.physicsforums.com/threads/cross-correlations-what-size-to-select-for-the-matrix.967222/#post-6141227 .

It discusses constraints on cosmological parameters and forecasts for future dark energy surveys using the Fisher matrix formalism.

Below is a capture from Martin White's paper: this is the part that builds the Fisher matrix with eq. (17), where ##C_{xy}## is the covariance matrix of the observables.

[Attachment: Ic7OFS1.png — excerpt from Martin White's paper showing eq. (17), the Fisher matrix built from the covariance matrix ##C_{xy}## of the observables]


I am trying to do cross-correlations with multiple populations (in my case, I have 3 populations of galaxies: BGS, LRG, ELG).

My main problem concerns the formula ##N_{a} = \bigg[1+\dfrac{1}{(\bar{n}\,P_{aa})}\bigg]##: some of my redshift bins have ##\bar{n}## equal to 0, which makes the definition of ##N_{a}## diverge.

Here are the data (column 1 = redshift; columns 2-4 = ##b_{BGS}, b_{LRG}, b_{ELG}##; columns 5-7 = ##\bar{n}_{BGS}, \bar{n}_{LRG}, \bar{n}_{ELG}##; the last two columns are f(z) and G(z)):

# z b1 b2 b3 n1 n2 n3 f(z) G(z)
1.7500000000e-01 1.1133849956e+00 0.0000000000e+00 0.0000000000e+00 2.8623766865e+03 0.0000000000e+00 0.0000000000e+00 6.3004698923e-01 6.1334138190e-01
4.2500000000e-01 1.7983127401e+00 0.0000000000e+00 0.0000000000e+00 2.0412110815e+02 0.0000000000e+00 0.0000000000e+00 7.3889083367e-01 6.1334138190e-01
6.5000000000e-01 0.0000000000e+00 1.4469899900e+00 7.1498329000e-01 0.0000000000e+00 8.3200000000e+02 3.0900000000e+02 8.0866367407e-01 6.1334138190e-01
8.5000000000e-01 0.0000000000e+00 1.4194157200e+00 7.0135835000e-01 0.0000000000e+00 6.6200000000e+02 1.9230000000e+03 8.5346288717e-01 6.1334138190e-01
1.0500000000e+00 0.0000000000e+00 1.4006739400e+00 6.9209771000e-01 0.0000000000e+00 5.1000000000e+01 1.4410000000e+03 8.8639317400e-01 6.1334138190e-01
1.2500000000e+00 0.0000000000e+00 0.0000000000e+00 6.8562140000e-01 0.0000000000e+00 0.0000000000e+00 1.3370000000e+03 9.1074033422e-01 6.1334138190e-01
1.4500000000e+00 0.0000000000e+00 0.0000000000e+00 6.8097541000e-01 0.0000000000e+00 0.0000000000e+00 4.6600000000e+02 9.2892482272e-01 6.1334138190e-01
1.6500000000e+00 0.0000000000e+00 0.0000000000e+00 6.7756594000e-01 0.0000000000e+00 0.0000000000e+00 1.2600000000e+02 9.4267299703e-01 6.1334138190e-01

As you can see, b2 and b3 overlap, as do n2 and n3 (3 redshift bins overlap).

Question 1) How can I deal with values ##b_{i}## and ##n_{j}## equal to zero, given the definition ##N_{a} = \bigg[1+\dfrac{1}{(\bar{n}\,P_{aa})}\bigg]##?

I mean, how do I avoid a division by zero when calculating the coefficients ##C_{ijkl}## of the covariance matrix?

Question 2) In my code, can I set a high value for ##N_{a}## when a redshift bin has a bias or ##\bar{n}## equal to zero?

Question 3) Do you think that ##C_{XY}## in the paper is a 3x3 or a 4x4 matrix? A colleague told me it could be a 3x3 matrix, whereas in my previous post ( https://www.physicsforums.com/threads/cross-correlations-what-size-to-select-for-the-matrix.967222/#post-6141227 ), it seems to be considered a 4x4 structure.

Regards
 

  • #2
1) Much of the time, when there's no data (the bin is empty), you can simply remove the bin from the result and not have it contribute in any way. If the bin shows up as the diagonal part of a covariance matrix, where it would get an infinite result, then you should simply delete that row/column entirely. An infinite result in the diagonal elements of a covariance matrix means that there is no constraint whatsoever on the value of that parameter. That is a sensible answer, but it's not something the computer can work with effectively. So best to just remove it.
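
For instance, a minimal numpy sketch of dropping an empty bin's row and column before inverting (the index of the empty bin is assumed known):

```python
import numpy as np

# Toy 3x3 covariance matrix in which bin 1 is empty: its diagonal
# entry would be infinite, i.e. no constraint at all on that bin.
cov = np.array([[2.0, 0.5, 0.1],
                [0.5, np.inf, 0.0],
                [0.1, 0.0, 1.5]])

empty = [1]  # index of the empty bin (assumed known)

# Drop the offending row and column, then invert what remains.
cov_clean = np.delete(np.delete(cov, empty, axis=0), empty, axis=1)
inv_cov = np.linalg.inv(cov_clean)
print(inv_cov)
```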

2) Yes, you can substitute a large value. But you may run into issues doing this, such as loss of numerical precision. Still, if you make the value a million times greater than the other values (give or take) it could work. Then you'd change the value to ten million times and see if your answer changes. But the result won't actually be useful. So it's much easier to simply remove it.

3) It's really hard to say from what you have written here. Depends upon how many rows/columns have genuinely useful data in them. The easiest way to check for sure would be to figure out the Fisher information matrix (##F_{ij}##) and see how many non-zero eigenvalues it has. The number of non-zero eigenvalues is the number of rows/columns you should have, full stop. Any repeated eigenvalues might also be removed, but you could get rid of the duplicates before constructing ##F_{ij}##.
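
As a quick sketch of that check, assuming the Fisher matrix is already assembled as a numpy array:

```python
import numpy as np

def effective_rank(F, tol=1e-10):
    """Count the eigenvalues of a symmetric Fisher matrix that are
    numerically non-zero; eigenvalues below tol * max are treated as
    zero, i.e. as unconstrained directions."""
    eigvals = np.linalg.eigvalsh(F)  # symmetric matrix => real spectrum
    return int(np.sum(eigvals > tol * eigvals.max()))

# Example: a rank-deficient 3x3 Fisher matrix.
F = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 0.0]])  # third parameter is unconstrained
print(effective_rank(F))  # 2 -> only two rows/columns carry information
```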
 
  • #3
kimbyd said:
1) Much of the time, when there's no data (the bin is empty), you can simply remove the bin from the result and not have it contribute in any way. If the bin shows up as the diagonal part of a covariance matrix, where it would get an infinite result, then you should simply delete that row/column entirely. An infinite result in the diagonal elements of a covariance matrix means that there is no constraint whatsoever on the value of that parameter. That is a sensible answer, but it's not something the computer can work with effectively. So best to just remove it.

2) Yes, you can substitute a large value. But you may run into issues doing this, such as loss of numerical precision. Still, if you make the value a million times greater than the other values (give or take) it could work. Then you'd change the value to ten million times and see if your answer changes. But the result won't actually be useful. So it's much easier to simply remove it.

3) It's really hard to say from what you have written here. Depends upon how many rows/columns have genuinely useful data in them. The easiest way to check for sure would be to figure out the Fisher information matrix (##F_{ij}##) and see how many non-zero eigenvalues it has. The number of non-zero eigenvalues is the number of rows/columns you should have, full stop. Any repeated eigenvalues might also be removed, but you could get rid of the duplicates before constructing ##F_{ij}##.

To build the Fisher matrix, what I have done is to sum, over the population types, the inverses of the covariance matrices, but I don't know if this is correct. I have taken a high value for ##N_{a}## when ##\bar{n}## is null (because ##N_{a} = [1+1/(\bar{n}\, P_{aa})]##); the constraints I get are unchanged above 1e+20, so this value is enough to represent ##N_{a}## when ##\bar{n}## is zero.

In my case, I have 3 populations of galaxies, so I get a 4x4 covariance matrix (aa, bb, cc, bc). I have 8 redshift bins, but some ##\bar{n}## values are equal to zero for each of the 3 populations (in fact, populations bb and cc overlap in 3 redshift bins). The ##P## terms represent the matter power spectrum, which depends on ##\mu##, ##k##, and redshift ##z##.

Question 1) I think that summing the inverted covariance matrices over population types in the integrand, and then integrating the result over (##\mu,k##), will give tight constraints (I mean small variance values). What do you think about that? (In this version, I get a high Figure of Merit, FoM = 347.) So we could say that with this summing, the constraints come out too good, i.e. overly optimistic.

From the capture of the paper, it seems there is no sum over population types in the construction of the covariance matrix ##C_{xy}##: is ##C_{xy}## unique, or do I have to sum over population types to build it, i.e. is there a different ##C_{xy}## for each galaxy population?
 
  • #4
To try to get a handle on what you're doing, I'll imagine you have three bins which we will name X, Y, and Z, and three populations A, B, and C.

In bin Y, population C has data, but populations A and B do not.

However you do your sum, bin Y cannot contribute to any cell in the matrix touched by populations A and B. Bin Y can impact the CC matrix element, but that's it.

If you look at how the sums are done, the infinite result should actually make a degree of sense: when combining data from two data sets for a single value with a standard deviation ##\sigma##, then the combined variance on the estimated variable, ##\sigma_c^2##, is calculated using:
$${1 \over \sigma_c^2} = {1 \over \sigma_1^2} + {1 \over \sigma_2^2}$$

So if ##\sigma_2## is infinite but ##\sigma_1## is not, then ##\sigma_2## adds nothing to the sum and the result is simply that ##\sigma_c = \sigma_1##.
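
Numerically, the same behaviour falls straight out of IEEE floating-point arithmetic; a quick illustration:

```python
import numpy as np

sigma1 = 0.3
sigma2 = np.inf  # "no data": infinite uncertainty

# Combine by summing inverse variances; 1/inf is exactly 0.0 in
# floating point, so the empty data set drops out by itself.
inv_var_c = 1.0 / sigma1**2 + 1.0 / sigma2**2
print(np.sqrt(1.0 / inv_var_c))  # 0.3 (up to rounding): sigma_c == sigma1
```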

Thus eliminating components of the sum where there is no data is the right thing to do. The tricky thing is that you then have to actually delete rows/columns in the covariance matrix which have no contributions from any data source. I think in your case limiting it to simply the four values (aa, bb, cc, bc) should do that for you already.
 
  • #5
@kimbyd

Thanks for your patience; the issue is a little tricky. I have understood what you mean about replacing ##N_a## with a high value (since ##\bar{n} = 0## at some redshifts, which is what you call "no data").

Doing this, I can sum the 3 inverted covariance matrices (because I consider only 3 populations, with an overlap between population "1" and population "2").

Below are the 3 inverted covariance matrices computed for a given redshift bin (bin = 2, the third redshift bin, starting from bin = 0); from top to bottom they correspond respectively to "population 0", "population 1", and "population 2" (the only populations I use):

[Attachment: zb9cBAj.png — the three inverted covariance matrices for this redshift bin]


At this redshift (the third bin), I initially have no data for "population 0" and there is an overlap in redshift between "population 1" and "population 2".

As you indicated, the contribution of "population 0" is negligible (since I took a high value for the ##N_{a}## term): this way, I avoid division by zero (for the moment I prefer doing this in my code, but I will surely delete the rows/columns afterwards).

QUESTION 1) I wonder whether my method is correct, i.e. whether I can sum the 3 inverse matrices (one per population; ##C^{-1}_{xy}## in Martin White's paper): indeed, I know that in the Fisher formalism we can sum two Fisher matrices of independent variables to get a final Fisher matrix that yields tighter constraints on the parameters when inverted.

So I would like to know whether I can do the same here, knowing that there will be negligible terms as above; see the toy illustration below.
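
For reference, a toy illustration of the Fisher-summing property (independent datasets; the matrices here are arbitrary examples):

```python
import numpy as np

# Two toy 2x2 Fisher matrices from independent datasets.
F1 = np.array([[4.0, 1.0],
               [1.0, 2.0]])
F2 = np.array([[3.0, 0.5],
               [0.5, 5.0]])

F_total = F1 + F2  # Fisher matrices of independent data simply add

# Marginalised 1-sigma uncertainties shrink after combining:
for F in (F1, F2, F_total):
    print(np.sqrt(np.diag(np.linalg.inv(F))))
```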

Here, for the moment, is how my code computes the term to be integrated in order to obtain the Fisher matrix element ##F_{ij}##:

[Attachment: UH5KpWz.png — code computing the integrand of the Fisher matrix element ##F_{ij}##]


As you can see, I loop over the population types, building a covariance matrix for each population. Then I invert it for each (k, mu) and sum the terms of the product to be integrated, i.e. the integrand ##\dfrac{\partial P_x}{\partial p_i}\,(C^{-1})_{xy}\,\dfrac{\partial P_y}{\partial p_j}## (the integration itself does not appear here).

You can notice that the outer sum runs over the 3 populations and, inside it, the sum runs over the 4x4 elements (which correspond to the terms of the final covariance matrix).
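
In outline, the computation described above might look like the following (a hypothetical reconstruction, not the code in the screenshot; `build_cov` and `dP_dparam` are assumed stand-ins for the real covariance and derivative routines):

```python
import numpy as np

def fisher_integrand(i, j, k, mu, z_bin,
                     populations, build_cov, dP_dparam):
    """Sum over populations of dP/dp_i . C^{-1} . dP/dp_j at fixed (k, mu).

    build_cov and dP_dparam are hypothetical placeholders for the real
    covariance and power-spectrum-derivative computations.
    """
    total = 0.0
    for pop in populations:
        C = build_cov(pop, k, mu, z_bin)        # 4x4: aa, bb, cc, bc
        C_inv = np.linalg.inv(C)
        dP_i = dP_dparam(pop, i, k, mu, z_bin)  # vector of dP_x/dp_i
        dP_j = dP_dparam(pop, j, k, mu, z_bin)  # vector of dP_y/dp_j
        total += dP_i @ C_inv @ dP_j            # contracts the 4x4 elements
    return total
```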

So I am still trying to grasp the hidden subtleties between the "3x3 matrix" case and the "4x4 matrix" case.

QUESTION 2) What do you think of the following 4x4 structure ("population 0", "population 1", "population 2" for "aa", "bb", "cc" respectively, plus the cross-correlation "bc")?

[Attachment: yXVKscl.png — the proposed 4x4 covariance structure (aa, bb, cc, bc)]

QUESTION 3) How can I delete the 4th row and 4th column? I mean, how can I justify this removal: is it because of the terms LRGxELG/LRG, LRGxELG/ELG, LRG/LRGxELG, and ELG/LRGxELG, and how can I prove that they are equal to zero, or can be treated as zero?

Regards

PS: please forgive my lack of experience with multi-population cross-correlations; I am new to this topic.

kimbyd said:
To try to get a handle on what you're doing, I'll imagine you have three bins which we will name X, Y, and Z, and three populations A, B, and C.

In bin Y, population C has data, but populations A and B do not.

However you do your sum, bin Y cannot contribute to any cell in the matrix touched by populations A and B. Bin Y can impact the CC matrix element, but that's it.

If you look at how the sums are done, the infinite result should actually make a degree of sense: when combining data from two data sets for a single value with a standard deviation ##\sigma##, then the combined variance on the estimated variable, ##\sigma_c^2##, is calculated using:
$${1 \over \sigma_c^2} = {1 \over \sigma_1^2} + {1 \over \sigma_2^2}$$

So if ##\sigma_2## is infinite but ##\sigma_1## is not, then ##\sigma_2## adds nothing to the sum and the result is simply that ##\sigma_c = \sigma_1##.

Thus eliminating components of the sum where there is no data is the right thing to do. The tricky thing is that you then have to actually delete rows/columns in the covariance matrix which have no contributions from any data source. I think in your case limiting it to simply the four values (aa, bb, cc, bc) should do that for you already.
 

  • #6
As to your questions:
1) Yes, you can sum up as many inverse covariance matrices as you want. The fake values are a bit risky, but you can verify the risk by changing them and seeing if that changes your result.
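
A toy version of that verification, with an arbitrary large value `big` standing in for the fake ##N_a## contribution:

```python
import numpy as np

def toy_constraint(big):
    """Toy 2x2 case: one real bin plus one 'empty' bin whose noise
    term is replaced by the large fake value `big`."""
    cov = np.array([[1.0, 0.0],
                    [0.0, big]])
    fisher = np.linalg.inv(cov)
    # Marginalised uncertainty on the first (real) parameter:
    return np.sqrt(np.linalg.inv(fisher)[0, 0])

for big in (1e6, 1e7, 1e8):
    print(big, toy_constraint(big))
# The uncertainty on the real parameter stays at 1.0 for every `big`,
# confirming the fake value does not leak into the answer here.
```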
2) That's a good structure, but I'd switch up the notation. This is fundamentally a fourth-order structure that is being packed into a 2D matrix. So the diagonal elements should be BGS^4, LRG^4, ELG^4, LRG^2 ELG^2, and the off-diagonal components scaled to match. They aren't exactly this, of course, but it is at least a bit more true to what is actually going on. There's also a minor error in labeling the rows/columns: the last three should be "ab, ac, bc," but ab and ac are zero.

You definitely can't get rid of the "bc" row/column, but you could get rid of the "aa" row/column, and treat it as a separate entity altogether (because the aa row/column is independent of the others). This probably won't help you with anything, but it could be done.

3) You definitely can't delete the 4th row/column. If the bbcc element isn't zero, then the bbbc, bbcc, bcbb, bccc, bcbc elements can't be zero either. They all matter.
 

1. What is a covariance matrix?

A covariance matrix is a square matrix that contains the variances and covariances of a set of variables. It is used to represent the relationship between multiple variables and is commonly used in statistics and data analysis.

2. Why are there different sizes for covariance matrices?

The size of a covariance matrix depends on the number of variables being analyzed. A 3x3 covariance matrix is used when there are three variables, while a 4x4 covariance matrix is used for four variables. The size can vary depending on the number of variables in the dataset.

3. How is a covariance matrix calculated?

A covariance matrix is calculated from the data: each diagonal element is the variance of one variable, and each off-diagonal element is the covariance between a pair of variables, which equals the product of their standard deviations and their correlation coefficient.
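
For instance, with numpy (a minimal illustration on random data):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(3, 100))  # 3 variables, 100 observations each

cov = np.cov(data)  # rows are variables by default (rowvar=True)
print(cov.shape)    # (3, 3): variances on the diagonal,
                    # covariances off the diagonal
```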

4. What is the significance of the size of a covariance matrix?

The size of a covariance matrix matters because it reflects the number of variables being analyzed and hence the amount of information that can be extracted from the data. A larger matrix allows a more detailed analysis of the relationships between variables.

5. Can a covariance matrix be used for any type of data?

Yes, a covariance matrix can be computed for any numerical dataset, although many methods built on it (such as Gaussian likelihood analyses) additionally assume the data are approximately normally distributed. It is commonly used in fields such as finance, economics, and psychology to analyze the relationships between variables.
