Error Propagation: Calculating Errors with the Variance-Covariance Matrix

AI Thread Summary
The discussion centers on the confusion surrounding error calculation using the variance-covariance matrix in parameter fitting. It highlights that the one-sigma error for parameters is typically derived from the diagonal elements of the matrix, which raises questions about the treatment of covariance terms. Participants explore the idea of diagonalizing the covariance matrix to achieve independent variables and propose using error propagation for more accurate uncertainty estimates. Concerns are expressed about how to appropriately assign variation to parameters when data is not random. Overall, the conversation seeks clarity on the relationship between least squares fitting, maximum likelihood, and the implications of covariance in error estimation.
0xDEADBEEF
I am confused about calculating errors. I have learned that if you take the variance-covariance matrix ##\Sigma_{ij}## of a fit of a function ##f(x,p)## to data for parameters ##p_i## (for example by using Levenberg-Marquardt), then the one-sigma error interval for ##p_i## is ##\sigma_{p_i}=\sqrt{\Sigma_{ii}}##. I only understand this if there are no covariance terms. Why do we do this? I would have thought a better way to find the error would be to diagonalize ##\Sigma##; say the diagonal form is ##\Xi## with normalized eigenvectors ##(\vec{v})_k##. Then we would have independent variables that have a Gaussian distribution, and one can calculate the error on ##p_i## using error propagation, i.e. ##\sigma_{p_i} = \sqrt{\sum_k \Xi_{kk}\left\langle(\vec{v})_k\mid l_i \right\rangle^2}##, where ##\left\langle(\vec{v})_k\mid l_i \right\rangle## is the ##i^\text{th}## component of ##(\vec{v})_k##. If this is permissible, is there a name for it?
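In code, what I have in mind is something like this rough NumPy sketch (the matrix below is just a made-up positive-definite stand-in for a fitted ##\Sigma##):

```python
import numpy as np

# Made-up positive-definite covariance matrix, standing in for the Sigma of a fit.
Sigma = np.array([[0.1,  2.0],
                  [2.0, 50.0]])

# Diagonalize: Sigma = V @ diag(Xi) @ V.T, with eigenvector v_k in column k of V.
Xi, V = np.linalg.eigh(Sigma)

# Error propagation back to the original parameters:
# sigma_{p_i}^2 = sum_k Xi_kk * (i-th component of v_k)^2
sigma_prop = np.sqrt((V**2) @ Xi)

# The usual prescription, for comparison.
sigma_diag = np.sqrt(np.diag(Sigma))

print(sigma_prop)
print(sigma_diag)
```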
 
0xDEADBEEF said:
I am confused about calculating errors. I have learned that if you take the variance-covariance matrix ##\Sigma_{ij}## of a fit of a function ##f(x,p)## to data for parameters ##p_i## (for example by using Levenberg-Marquardt), then the one-sigma error interval for ##p_i## is ##\sigma_{p_i}=\sqrt{\Sigma_{ii}}##.

It is rather confusing how any process can purport to calculate a standard deviation for the parameters of a fit ##y = f(x,p)## in the case when the data is of the form ##(x_i,y_i)##. There is no random sample of the parameters. How can any variation be assigned to them? My best guess is in post #7 of the thread: https://www.physicsforums.com/showthread.php?t=645291&highlight=parameters

I'm not sure what you mean by "the variance-covariance matrix ##\Sigma_{ij}## of a fit of the function ##f(x,p)## to the data for parameters ##p_i##". What is the definition of that matrix?
 
Well, I guess that you know the theory better than I do, but the idea is that there is some correspondence between least squares and maximum likelihood.
So you have the sum of squares of a fit function ##f(x,p_1,p_2,\dots)## over the data ##(x_i,y_i)##,

$$sq(p_1,p_2,\dots) = \sum_i \left(f(x_i,p_1,p_2,\dots)-y_i\right)^2$$

and the residuals

$$r_i = f(x_i,p_1,p_2,\dots)-y_i$$

for some optimal set of parameters ##p_k## that minimizes ##sq##. If the residuals are Gaussian, then the variance of the residuals times the inverse of the Hessian of ##sq(p_1,p_2,\dots)## is somehow a measure of how confident one can be in the fitted parameters, and it is also a variance-covariance matrix. This is how I understand it, but if I really understood the theory I wouldn't be asking questions. Anyhow, my question was why one only uses the diagonal elements of that matrix.
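For what it's worth, this is roughly how it looks in practice, e.g. with SciPy's curve_fit (the model and data below are made up for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up model and noisy data, just to have something to fit.
def f(x, a, b):
    return a * np.exp(-b * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = f(x, 2.5, 1.3) + rng.normal(scale=0.1, size=x.size)

# Levenberg-Marquardt least squares; pcov is the variance-covariance matrix
# of the fitted parameters (roughly, residual variance times the inverse Hessian of sq).
popt, pcov = curve_fit(f, x, y, p0=[1.0, 1.0])

# The usual one-sigma errors: square roots of the diagonal elements of pcov.
perr = np.sqrt(np.diag(pcov))
print(popt, perr)
```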
 
Can your original question be considered outside of the context of curve-fitting?

0xDEADBEEF said:
I would have thought a better way to find the error would be to diagonalize ##\Sigma##; say the diagonal form is ##\Xi## with normalized eigenvectors ##(\vec{v})_k##. Then we would have independent variables that have a Gaussian distribution, and one can calculate the error on ##p_i## using error propagation, i.e. ##\sigma_{p_i} = \sqrt{\sum_k \Xi_{kk}\left\langle(\vec{v})_k\mid l_i \right\rangle^2}##, where ##\left\langle(\vec{v})_k\mid l_i \right\rangle## is the ##i^\text{th}## component of ##(\vec{v})_k##. If this is permissible, is there a name for it?

Suppose the ##p_i## are simply a set of random variables, not necessarily having the meaning of parameters in a curve fit. If the covariance matrix is ##\Sigma##, are you proposing a method to get a different estimate for each ##\sigma^2_{p_i}## than using the diagonal element ##\Sigma_{ii}##?
 
Exactly. Maybe the thing I am looking for already has a name. If we have a covariance matrix like this:

$$\Sigma = \begin{pmatrix} 0.1 & 100 \\ 100 & 1000 \end{pmatrix}$$

The first parameter is varying very little while the second one is varying a lot. But the second parameter also has a large influence on the first parameter, and it seems to me that this does not get captured if we use 0.1 as the variance for the first parameter. So I was suggesting to diagonalize the matrix to get independent parameters and then use something like error propagation to determine the "real" uncertainty of the first parameter. I tried to make an example, but I don't know how to make random numbers with a given covariance matrix.
 
You could use a bivariate normal distribution and try to get the desired covariance matrix.
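For example, something like this (a rough NumPy sketch; the covariance values are made up and chosen so that the matrix is positive definite):

```python
import numpy as np

# Target covariance matrix (made-up values, chosen to be positive definite).
Sigma = np.array([[0.1,    5.0],
                  [5.0, 1000.0]])
mean = np.zeros(2)

# Draw samples from a bivariate normal with that covariance.
rng = np.random.default_rng(1)
samples = rng.multivariate_normal(mean, Sigma, size=100_000)

# The empirical covariance of the samples should come out close to Sigma.
print(np.cov(samples, rowvar=False))
```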

If you don't want to use the variance of a random variable to define its uncertainty, you'll have to state what definition of uncertainty you want to use.

The variance of one random variable in a joint distribution doesn't define a joint confidence interval for several variables. Perhaps you are trying to find a joint confidence interval.
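As a rough illustration (NumPy, made-up numbers): the marginal one-sigma intervals come from the diagonal of ##\Sigma##, while a joint one-sigma region for a bivariate normal is the ellipse ##x^T\Sigma^{-1}x = 1##, whose axes come from the eigen-decomposition.

```python
import numpy as np

# Made-up positive-definite covariance matrix.
Sigma = np.array([[0.1,    5.0],
                  [5.0, 1000.0]])

# Marginal one-sigma intervals, one parameter at a time.
marginal_sigma = np.sqrt(np.diag(Sigma))

# Joint one-sigma ellipse x^T Sigma^{-1} x = 1: principal axes along the
# eigenvectors of Sigma, with half-lengths sqrt(eigenvalue).
eigvals, eigvecs = np.linalg.eigh(Sigma)
half_lengths = np.sqrt(eigvals)

print(marginal_sigma)   # per-parameter interval half-widths
print(half_lengths)     # half-axes of the joint ellipse
```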
 