Covariance matrix with asymmetric uncertainties

SUMMARY

The discussion focuses on calculating the Chi-Squared statistic using a covariance matrix when the data points, measurements of the cosmic proton flux, carry asymmetric uncertainties. The covariance matrix combines statistical and systematic uncertainties, with diagonal entries \sigma^2_{1, stat} + \sigma^2_{1, syst} and off-diagonal entries \rho_{12} \sigma_{1,syst} \sigma_{2, syst}. The original poster asks how to handle asymmetric uncertainties during minimization, and a reply suggests two approaches: fit first with symmetric uncertainties to learn the direction of each deviation and then re-run with the matching uncertainties, or compute the likelihood externally so that the right direction can be picked at every iteration. The reply also cautions that reducing asymmetric uncertainties to two numbers can be dangerous, because the underlying likelihood is probably not Gaussian on each side.

PREREQUISITES
  • Understanding of covariance matrices in statistical analysis
  • Familiarity with Chi-Squared minimization techniques
  • Knowledge of likelihood functions and their applications
  • Experience with systematic and statistical uncertainties in data analysis
NEXT STEPS
  • Research methods for constructing covariance matrices with asymmetric uncertainties
  • Learn about likelihood-based analysis techniques for fitting models
  • Explore minimization algorithms that accommodate complex uncertainty estimates
  • Investigate the GSHL formula for cosmic proton flux predictions and its fitting process
USEFUL FOR

Researchers and analysts in high-energy physics, particularly those working with cosmic ray data, statisticians involved in uncertainty quantification, and anyone interested in advanced statistical modeling techniques.

Daaavde
Hello everyone, I'm currently building the covariance matrix of a large dataset in order to calculate the Chi-Squared. The covariance matrix has this form:

\begin{bmatrix}
\sigma^2_{1, stat} + \sigma^2_{1, syst} & \rho_{12} \sigma_{1,syst} \sigma_{2, syst} & ... \\
\rho_{12} \sigma_{1,syst} \sigma_{2, syst} & \sigma^2_{2, stat} + \sigma^2_{2, syst} & ... \\
... & ... & ...
\end{bmatrix}

However, all my data points have asymmetric uncertainties (d^{+ \sigma^+_n}_{- \sigma^-_n}) where (\sigma^+_n \neq \sigma^-_n).
How do I calculate the Chi-Squared in this case?
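
For reference, the symmetric case of the matrix above can be sketched in a few lines of NumPy. This is only a baseline under stated assumptions: the names `build_covariance` and `chi_squared` are made up for illustration, `rho` is assumed to be the full correlation matrix of the systematic uncertainties with ones on the diagonal, and the statistic is taken to be the usual quadratic form of the residuals with the inverse covariance matrix.

```python
import numpy as np

def build_covariance(sigma_stat, sigma_syst, rho):
    """Return C with C_ij = rho_ij * syst_i * syst_j, plus stat_i^2 on the diagonal.

    rho is assumed to have ones on its diagonal, so the diagonal of C
    becomes stat_i^2 + syst_i^2 as in the matrix above.
    """
    sigma_stat = np.asarray(sigma_stat, dtype=float)
    sigma_syst = np.asarray(sigma_syst, dtype=float)
    cov = np.asarray(rho, dtype=float) * np.outer(sigma_syst, sigma_syst)
    cov[np.diag_indices_from(cov)] += sigma_stat**2
    return cov

def chi_squared(data, model, cov):
    """Quadratic form r^T C^-1 r with r = data - model."""
    residual = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    # Solving the linear system avoids forming the explicit inverse.
    return float(residual @ np.linalg.solve(cov, residual))
```

With asymmetric uncertainties there is no single `sigma_stat` or `sigma_syst` per point to pass in, which is exactly the open question here.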
 
If your uncertainties are asymmetric, reducing them to two numbers can be dangerous because you probably don't have a perfect Gaussian distribution of the likelihood towards each side separately. You could use the uncertainty that applies in your case (pick the one for the right direction), but a likelihood-based analysis might be better.
 
I thought about picking the uncertainty that applies in each case (the lower uncertainty if the fit lies below the data point, and vice versa), but the problem is that I'm using the covariance matrix inside a minimizer to find the best-fit parameters for my test formula.

Currently I'm generating my matrix (500x500) outside the minimizer (the minimizer loops over the values of the parameters of my fit formula, so only the difference vector needs to be recalculated at each iteration), but picking the right uncertainties when building the covariance matrix would mean constructing a different covariance matrix at each iteration. Is there a way to avoid that?

I'm interested in the likelihood-based analysis you mentioned. How would it solve the asymmetric uncertainty problem?
 
Where do your uncertainties come from, what do you fit, and how?
Likelihood
 
My uncertainties are systematic and statistical uncertainties on datapoints representing the flux of cosmic protons as a function of energy. The systematic uncertainties come from different factors related to the detector, resolution and MC.

I'm currently performing a global fit including different experiments measuring the flux of cosmic protons. In order to do that I'm comparing a formula (GSHL) predicting the flux of cosmic protons with the actual data (their difference is the numerator of my Chi-Squared). The cosmic ray formula depends on four parameters. By minimizing the Chi-Squared (looping through different values of the four parameters) I intend to determine the best fit values for the four parameters and their relative uncertainties.
 
The minimizer probably uses this covariance matrix to produce a likelihood estimate, and maximizes this likelihood (more likely: minimizes the negative logarithm of it). Approaches I see:
- use symmetric uncertainties to get an estimate accurate enough to know which direction your deviation has for each bin, then plug in the correct direction and re-run. Should work if the asymmetries are not too large.
- Figure out if your minimization program allows you to calculate the likelihood externally, where you can pick the right direction in every iteration.
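
A rough sketch of the first approach, independent of any particular minimizer: build one covariance matrix from symmetrized uncertainties for the first pass, then build a second, fixed matrix using the direction each point deviates in at the first-pass best fit. Everything here is an assumption made for illustration: the function names, the choice to symmetrize by averaging the upper and lower uncertainties, the split into separate asymmetric statistical and systematic parts, and the convention that the upper uncertainty applies where the model lies above the data.

```python
import numpy as np

def directional_sigma(data, model, sigma_plus, sigma_minus):
    """Per point: sigma_plus where the model lies above the data, else sigma_minus."""
    above = np.asarray(model, dtype=float) > np.asarray(data, dtype=float)
    return np.where(above, sigma_plus, sigma_minus)

def covariance(sigma_stat, sigma_syst, rho):
    """C_ij = rho_ij * syst_i * syst_j, plus stat_i^2 on the diagonal."""
    cov = np.asarray(rho, dtype=float) * np.outer(sigma_syst, sigma_syst)
    cov[np.diag_indices_from(cov)] += np.asarray(sigma_stat, dtype=float) ** 2
    return cov

def first_pass_covariance(stat_p, stat_m, syst_p, syst_m, rho):
    """Pass 1: symmetrize by averaging upper and lower uncertainties (an assumption)."""
    stat = 0.5 * (np.asarray(stat_p, dtype=float) + np.asarray(stat_m, dtype=float))
    syst = 0.5 * (np.asarray(syst_p, dtype=float) + np.asarray(syst_m, dtype=float))
    return covariance(stat, syst, rho)

def second_pass_covariance(data, model_pass1, stat_p, stat_m, syst_p, syst_m, rho):
    """Pass 2: fix each point's uncertainty by its direction at the pass-1 best fit."""
    stat = directional_sigma(data, model_pass1, stat_p, stat_m)
    syst = directional_sigma(data, model_pass1, syst_p, syst_m)
    return covariance(stat, syst, rho)
```

Both matrices are built outside the minimizer, so each pass still only recomputes the difference vector per iteration. If points change side between the two passes, the second step could be repeated until the directions stop changing.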

The second approach also allows you to include more complex uncertainty estimates. The asymmetric errors are probably just an approximation to a more complex likelihood function, and directly using this function would be more accurate.
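
The second approach might look roughly like the following: an objective function defined outside the minimizer that picks the direction for every point and rebuilds the covariance matrix on each call. The names `make_objective` and `model_fn` are hypothetical, as is the split into asymmetric statistical and systematic parts; `model_fn(x, theta)` stands in for the flux formula and is not the GSHL formula itself. The returned function could be handed to any minimizer that accepts a user-supplied function, for example `scipy.optimize.minimize`.

```python
import numpy as np

def make_objective(x, data, model_fn, stat_p, stat_m, syst_p, syst_m, rho):
    """Return chi2(theta) that re-selects each point's uncertainties on every call."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    stat_p, stat_m = np.asarray(stat_p, dtype=float), np.asarray(stat_m, dtype=float)
    syst_p, syst_m = np.asarray(syst_p, dtype=float), np.asarray(syst_m, dtype=float)
    rho = np.asarray(rho, dtype=float)

    def objective(theta):
        model = model_fn(x, theta)
        residual = data - model
        # Assumed convention: upper uncertainty where the model lies above the data.
        above = model > data
        stat = np.where(above, stat_p, stat_m)
        syst = np.where(above, syst_p, syst_m)
        # Rebuild the covariance matrix for this set of directions.
        cov = rho * np.outer(syst, syst)
        cov[np.diag_indices_from(cov)] += stat**2
        return float(residual @ np.linalg.solve(cov, residual))

    return objective
```

Two caveats on this sketch. It keeps only the quadratic term; whether a normalization term belongs in the objective depends on the likelihood the asymmetric errors approximate, which is not specified in this thread. And because the matrix switches whenever a residual changes sign, the objective is not guaranteed to be smooth in the parameters, which some minimizers handle poorly. The cost is one 500x500 linear solve per iteration.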
 
