Error Propagation: Calculating Errors with the Variance-Covariance Matrix


Discussion Overview

The discussion revolves around the calculation of errors when fitting functions to data, using the variance-covariance matrix. Participants explore the implications of using the diagonal elements of the matrix for error estimation and consider alternative approaches, including diagonalization and error-propagation methods.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants express confusion about the calculation of errors from the variance-covariance matrix \Sigma_{ij}, particularly regarding the assumption that one-sigma error intervals can be derived solely from the diagonal elements.
  • There is a proposal to diagonalize the covariance matrix to obtain independent variables, suggesting that this could provide a more accurate method for calculating uncertainties on parameters.
  • One participant questions the validity of assigning variation to parameters in the absence of random samples, seeking clarification on the definition of the variance-covariance matrix in this context.
  • Another participant discusses the relationship between least squares fitting and maximum likelihood estimation, mentioning that the variance of residuals relates to confidence in fitted parameters.
  • Concerns are raised about whether using only the diagonal elements of the covariance matrix adequately captures the influence of covariance between parameters.
  • There is a suggestion to use a bivariate normal distribution to achieve a desired covariance matrix, along with a discussion on defining uncertainty in a joint distribution context.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the best method for calculating errors from the variance-covariance matrix. Multiple competing views and approaches are presented, indicating ongoing uncertainty and exploration of the topic.

Contextual Notes

Limitations include the potential misunderstanding of the variance-covariance matrix's role in error estimation and the lack of clarity on how to define uncertainty in the context of joint distributions.

0xDEADBEEF
I am confused about calculating errors. I have learned that if you take the variance-covariance matrix \Sigma_{ij} of a fit of a function f(x,p) to data for parameters p_i (for example by using Levenberg-Marquardt), then the one-sigma error interval for p_i is \sigma_{p_i}=\sqrt{\Sigma_{ii}}. I only understand this if there are no covariance terms. Why do we do this? I would have thought a better way to find the error would be to diagonalize \Sigma; say the diagonal form is \Xi with normalized eigenvectors (\vec{v})_k. Then we would have independent variables that have a Gaussian distribution, and one can calculate the error on p_i using error propagation, i.e. \sigma_{p_i} = \sqrt{\sum_k \Xi_{kk}\,\langle(\vec{v})_k\mid l_i\rangle^2}, where \langle(\vec{v})_k\mid l_i\rangle is the i^\text{th} component of (\vec{v})_k. If this is permissible, is there a name for it?
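For concreteness, here is a minimal numerical sketch of the proposed diagonalization, assuming NumPy; the 2x2 matrix is invented for illustration. Since \Sigma = V \Xi V^T, the propagated variance \sum_k \Xi_{kk}\,\langle(\vec{v})_k\mid l_i\rangle^2 is algebraically equal to \Sigma_{ii}, which the printout confirms.

```python
import numpy as np

# Illustrative variance-covariance matrix of two fitted parameters
Sigma = np.array([[0.5, 0.3],
                  [0.3, 2.0]])

# Diagonalize: Xi holds the variances of the independent (rotated) variables,
# the columns of V are the normalized eigenvectors v_k
Xi, V = np.linalg.eigh(Sigma)

# Error propagation back to the original parameters:
# var(p_i) = sum_k Xi_kk * (i-th component of v_k)^2
sigma_p = np.sqrt((V**2 * Xi).sum(axis=1))

print(sigma_p)                  # propagated one-sigma errors
print(np.sqrt(np.diag(Sigma)))  # sqrt of the diagonal elements: identical
```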
 
0xDEADBEEF said:
I am confused about calculating errors. I have learned that if you take the variance-covariance matrix \Sigma_{ij} of a fit of a function f(x,p) to data for parameters p_i (for example by using Levenberg-Marquardt), then the one-sigma error interval for p_i is \sigma_{p_i}=\sqrt{\Sigma_{ii}}

It is rather confusing how any process can purport to calculate a standard deviation for the parameters of a fit y = f(x,p) when the data are of the form (x_i,y_i). There is no random sample of the parameters, so how can any variation be assigned to them? My best guess is in post #7 of the thread: https://www.physicsforums.com/showthread.php?t=645291&highlight=parameters

I'm not sure what you mean by "the variance-covariance matrix \Sigma_{ij} of a fit of the function f(x,p) to the data for parameters p_i". What is the definition of that matrix?
 
Well, I guess you know the theory better than I do, but the idea is a correspondence between least squares and maximum likelihood. So you have the sum of squared residuals of a fit function f(x,p_1,p_2,\dots) to the data (x_i,y_i):

sq(p_1,p_2,\dots) = \sum_i (f(x_i,p_1,p_2,\dots)-y_i)^2

And the residuals
r_i=f(x_i,p_1,p_2,\dots)-y_i

for some optimal set of parameters p_k that minimizes sq. If the residuals are Gaussian, then the variance of the residuals times the inverse of the Hessian of sq(p_1,p_2,\dots) is somehow a measure of how confident one can be in the fitted parameters, and it is also a variance-covariance matrix. This is how I understand it, but if I really understood the theory I wouldn't be asking questions. Anyhow, my question was why one only uses the diagonal elements of that matrix.
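As a concrete illustration of this least-squares/covariance connection, here is a short sketch assuming SciPy; the model f, the data, and the noise level are invented for the example. scipy.optimize.curve_fit returns a variance-covariance matrix of exactly this kind, estimated from the Jacobian at the optimum (the Gauss-Newton approximation to the Hessian), scaled by the residual variance.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative model and synthetic data (names and values are assumptions)
def f(x, a, b):
    return a * np.exp(-b * x)

rng = np.random.default_rng(0)
xdata = np.linspace(0, 4, 50)
ydata = f(xdata, 2.5, 1.3) + 0.1 * rng.standard_normal(xdata.size)

# popt minimizes the sum of squared residuals; pcov is the estimated
# variance-covariance matrix of the fitted parameters
popt, pcov = curve_fit(f, xdata, ydata, p0=(1.0, 1.0))

# The errors usually quoted are the square roots of the diagonal elements
perr = np.sqrt(np.diag(pcov))
print(popt, perr)
print(pcov)  # the off-diagonal term shows the correlation between a and b
```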
 
Can your original question be considered outside of the context of curve-fitting?

0xDEADBEEF said:
I would have thought a better way to find the error would be to diagonalize \Sigma; say the diagonal form is \Xi with normalized eigenvectors (\vec{v})_k. Then we would have independent variables that have a Gaussian distribution, and one can calculate the error on p_i using error propagation, i.e. \sigma_{p_i} = \sqrt{\sum_k \Xi_{kk}\,\langle(\vec{v})_k\mid l_i\rangle^2}, where \langle(\vec{v})_k\mid l_i\rangle is the i^\text{th} component of (\vec{v})_k. If this is permissible, is there a name for it?

Suppose the p_i are simply a set of random variables, not necessarily having the meaning of parameters in a curve fit. If the covariance matrix is \Sigma, are you proposing a method to get a different estimate for each \sigma^2_{p_i} than using the diagonal element \Sigma_{ii}?
 
Exactly. Maybe the thing I am looking for already has a name. If we have a covariance matrix like this

\Sigma = \left( \begin{matrix} 0.1 & 100 \\ 100 & 1000 \end{matrix} \right)

The first parameter varies very little while the second one varies a lot. But the second parameter also has a large influence on the first parameter, and it seems to me that this does not get captured if we use 0.1 as the variance of the first parameter. So I was suggesting diagonalizing the matrix to get independent parameters and then something like error propagation to determine the "real" uncertainty of the first parameter. I tried to make an example, but I don't know how to make random numbers with a given covariance matrix.
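A standard way to generate such samples is via a Cholesky factor of the covariance matrix: if \Sigma = LL^T and z is a vector of independent standard normals, then Lz has covariance \Sigma. One caveat: the matrix in the post, with variances 0.1 and 1000 and covariance 100, is not positive semidefinite (100^2 > 0.1 \times 1000), so no pair of random variables can have exactly that covariance matrix. The sketch below, assuming NumPy, therefore uses a valid matrix of the same flavor: tiny first variance, huge second variance, strong correlation (0.9).

```python
import numpy as np

# A valid covariance matrix (0.1 * 1000 > 9^2)
Sigma = np.array([[0.1,    9.0],
                  [9.0, 1000.0]])

# If Sigma = L L^T (Cholesky) and z ~ N(0, I), then x = L z has covariance Sigma
L = np.linalg.cholesky(Sigma)          # requires Sigma positive definite
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 100_000))  # independent standard normals
x = L @ z                              # correlated samples, one per column

print(np.cov(x))  # empirical covariance: close to Sigma
# Equivalently: rng.multivariate_normal(np.zeros(2), Sigma, size=100_000)
```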
 
You could use a bivariate normal distribution and try to get the desired covariance matrix.

If you don't want to use the variance of a random variable to define its uncertainty, you'll have to state what definition of uncertainty you want to use.

The variance of one random variable in a joint distribution doesn't define a joint confidence interval for several variables. Perhaps you are trying to find a joint confidence interval.
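To make the marginal-versus-joint distinction concrete, here is a short sketch, assuming SciPy and reusing the valid example matrix from above: the marginal one-sigma intervals come from the diagonal of \Sigma, while a joint 68.3% confidence region for both parameters together is the ellipse (p-\hat{p})^T \Sigma^{-1} (p-\hat{p}) \le \chi^2_{0.683}(2).

```python
import numpy as np
from scipy.stats import chi2

Sigma = np.array([[0.1,    9.0],
                  [9.0, 1000.0]])

# Marginal one-sigma intervals: one variable at a time
marginal = np.sqrt(np.diag(Sigma))

# Joint 68.3% region: (p - p_hat)^T Sigma^{-1} (p - p_hat) <= chi2.ppf(0.683, 2)
r2 = chi2.ppf(0.683, df=2)
eigvals, eigvecs = np.linalg.eigh(Sigma)
semi_axes = np.sqrt(r2 * eigvals)  # ellipse semi-axes along the eigenvectors

print(marginal)   # per-parameter one-sigma half-widths
print(semi_axes)  # extents of the joint region along its principal axes
```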
 
