Non-linear least-squares: Uncertainty estimates including non-fitted parameters

Kazza_765
Hi all,

I have a question regarding least-squares, and I'm certain I can't be the first one to encounter it, but I've had no luck searching the literature for a solution. Here it is:

Say we have a non-linear least-squares optimisation problem. We have data points y_i and a model y(x_i;{\bf a}) where \bf a is a vector of parameters we wish to fit to the data. The merit function of course is

\chi^2 = \sum_{i=1}^{N} \left[\frac{y_i - y(x_i;{\bf a})}{\sigma_i}\right]^2

And we can get our parameter uncertainty estimates from the Hessian matrix.
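As a minimal sketch of that step, assuming SciPy is available: `scipy.optimize.curve_fit` minimises the weighted sum of squared residuals and returns the covariance matrix of the fitted parameters. The single-Lorentzian model, the synthetic data and the noise level below are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def model(x, amplitude, centre):
    """Hypothetical model y(x; a): a Lorentzian with unit half-width."""
    return amplitude / (1.0 + (x - centre) ** 2)


# Synthetic data with known per-point uncertainties sigma_i (illustrative values).
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 101)
sigma = np.full_like(x, 0.05)
y = model(x, 2.0, 0.5) + rng.normal(0.0, sigma)

# absolute_sigma=True treats sigma as real measurement errors, so pcov is the
# covariance matrix of the fitted parameters without rescaling.
popt, pcov = curve_fit(model, x, y, p0=[1.0, 0.0], sigma=sigma, absolute_sigma=True)

perr = np.sqrt(np.diag(pcov))  # one-sigma uncertainties on amplitude and centre
chi2 = np.sum(((y - model(x, *popt)) / sigma) ** 2)  # merit function at the best fit

print("best-fit parameters:", popt)
print("uncertainties:", perr)
print("chi^2:", chi2)
```

These uncertainties only reflect the noise in the data; any parameter held fixed inside `model` is treated as exactly known.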


Now let's say that the model we wish to fit to the data is y(x_i;{\bf a};{\bf b}) where \bf a is still our vector of fitted parameters, and \bf b is a vector of fixed parameters. At this stage nothing has changed, I've just re-written y(x_i) to make the non-fitted parameters explicit.


We can go ahead and get our uncertainties for \bf a in the same manner as before. My question, however, is this:

Suppose there are uncertainties in \bf b. How do we incorporate those uncertainties into the estimated uncertainty of \bf a?




As an example of the problem: Say we wish to fit three Lorentzians of known width to some data. The amplitudes and centroids are fitted parameters (i.e. \bf a), and the widths are fixed parameters (i.e. \bf b). Perhaps we know the widths from theory, or from another experiment.

Using the Hessian matrix we can get an uncertainty in the amplitudes of the Lorentzians. However, if there is also uncertainty in the widths, then our uncertainty in the amplitudes will increase. The uncertainty in the widths needs to be incorporated into the problem somehow, but I have no idea where to start with it.
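One common way to attack this kind of problem (not proposed anywhere in this thread, so treat it as an assumption) is a Monte Carlo over the fixed parameters: draw the widths from their assumed distribution, refit the amplitudes and centroids for each draw, and add the scatter of the refitted values to the statistical covariance. The sketch below assumes independent Gaussian uncertainties on the widths; all numerical values and function names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def lorentzian(x, amplitude, centre, width):
    """Lorentzian with peak height `amplitude` and half-width `width`."""
    return amplitude * width**2 / ((x - centre) ** 2 + width**2)


def make_model(widths):
    """Model with the widths (b) frozen; amplitudes and centroids (a) are free."""
    def model(x, a1, c1, a2, c2, a3, c3):
        return (lorentzian(x, a1, c1, widths[0])
                + lorentzian(x, a2, c2, widths[1])
                + lorentzian(x, a3, c3, widths[2]))
    return model


# Synthetic data (illustrative values only).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 30.0, 301)
sigma = np.full_like(x, 0.02)
a_true = [1.0, 8.0, 0.6, 15.0, 0.8, 22.0]
b_nominal = np.array([1.0, 1.2, 0.9])     # fixed widths
b_sigma = np.array([0.05, 0.05, 0.05])    # assumed Gaussian uncertainties on the widths
y = make_model(b_nominal)(x, *a_true) + rng.normal(0.0, sigma)

# Statistical covariance of a with b held at its nominal value.
popt, pcov = curve_fit(make_model(b_nominal), x, y, p0=a_true,
                       sigma=sigma, absolute_sigma=True)

# Refit the same data for many random draws of b.
n_draws = 200
fits = np.empty((n_draws, len(popt)))
for k in range(n_draws):
    b_k = rng.normal(b_nominal, b_sigma)
    fits[k], _ = curve_fit(make_model(b_k), x, y, p0=popt,
                           sigma=sigma, absolute_sigma=True)

# Scatter of the best-fit a caused by the uncertainty in b.
cov_from_b = np.cov(fits, rowvar=False)

# Approximate total covariance: statistical part plus the part induced by b.
cov_total = pcov + cov_from_b

print("statistical sigma:", np.sqrt(np.diag(pcov)))
print("total sigma:      ", np.sqrt(np.diag(cov_total)))
```

Adding the two covariances follows the law of total variance, with the statistical term approximated by its value at the nominal widths. That approximation is reasonable only when the width uncertainties are small.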



I'm sure there must already be a solution out there for a problem like this. I'm just not sure where to find it. Any help would be much appreciated.

Thank you
 
Kazza_765 said:
And we can get our parameter uncertainty estimates from the Hessian matrix.

There are various definitions of "uncertainty" and it isn't clear which one you are using. Least squares fitting of a model to data can be done by algorithms that involve computing Hessian matrices, but the only interpretation that I have seen where this bears on "uncertainty" is in a Bayesian formulation (such as described in this PDF: http://www.google.com/url?sa=t&source=web&cd=4&ved=0CC8QFjAD&url=http%3A%2F%2Fctr.stanford.edu%2F~jops%2Fuq_workshop%2Fsaturday%2Fghattas.pdf&ei=roRETr6ZDc3UiAK37PjXAQ&usg=AFQjCNGhzT9mkVQ7Dz8nY9xkpwM3tFCZfQ ).

Are you using Bayesian prior distributions for the parameter set 'a'?

You seem to be conflicted about the parameters 'b'. On the one hand, you aren't willing to vary them when you fit the model, and on the other hand you want to quantify their "uncertainty".
 
Stephen Tashi said:
There are various definitions of "uncertainty" and it isn't clear which one you are using. Least squares fitting of a model to data can be done by algorithms that involve computing Hessian matrices, but the only interpretation that I have seen where this bears on "uncertainty" is in a Bayesian formulation (such as described in this PDF: http://www.google.com/url?sa=t&source=web&cd=4&ved=0CC8QFjAD&url=http%3A%2F%2Fctr.stanford.edu%2F~jops%2Fuq_workshop%2Fsaturday%2Fghattas.pdf&ei=roRETr6ZDc3UiAK37PjXAQ&usg=AFQjCNGhzT9mkVQ7Dz8nY9xkpwM3tFCZfQ ).

Are you using Bayesian prior distributions for the parameter set 'a'?

I'm not using a Bayesian formulation (at least I don't think I am). The least-squares fitting is using the Levenberg-Marquardt method. I should have said the covariance matrix rather than the Hessian (although, if I recall correctly, one is the inverse of the other divided by two?).
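For reference, a sketch of that relation in the convention used by, for example, Numerical Recipes, where the curvature matrix \alpha is defined as half the Hessian of \chi^2. The symbols H, \alpha and C are introduced here for illustration and are not from the thread.

```latex
H_{jk} = \frac{\partial^2 \chi^2}{\partial a_j \, \partial a_k}, \qquad
\alpha = \tfrac{1}{2} H, \qquad
C = \alpha^{-1} = 2 H^{-1}
```

In this convention the covariance matrix C is twice the inverse of the Hessian evaluated at the best fit, assuming Gaussian measurement errors \sigma_i.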

You seem to be conflicted about the parameters 'b'. On the one hand, you aren't willing to vary them when you fit the model, and on the other hand you want to quantify their "uncertainty".

Perhaps I wasn't clear. I don't want to quantify the uncertainty of the parameters b; they already have a particular uncertainty associated with them. Rather, I wish to know how the uncertainty in the parameters b contributes to the uncertainty of the parameters a.

Here is the actual problem I'm trying to solve:

We have an experimental atomic photoemission spectrum that consists of overlapping spectra from a number of different transitions. From theory we know what the energies of the transitions are, so we can center a Lorentzian on each theoretical transition and then fit the amplitudes and widths to the experimental data. After doing this, we can work out the relative intensities of the various transitions from the integrated intensities, and from the covariance matrix we have an uncertainty in these values.

Just to clarify: the positions of the Lorentzians (on the x/energy axis) are fixed by theory and are not fitted parameters.

However, there is still uncertainty in the theory, and so there is uncertainty in the positions of those Lorentzians. This uncertainty in the positions will have an effect on the uncertainty of the amplitudes, and hence of the branching ratios we are ultimately trying to measure.
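For small position uncertainties, one standard linearised treatment (an assumption here, not something established in this thread) is to measure how the best-fit parameters shift when each fixed position is nudged, and then propagate the position covariance through that sensitivity matrix. In the sketch below the two-peak model, the data and the covariance `cov_b` of the theoretical positions are all hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def make_model(centres):
    """Two Lorentzians with centres (b) fixed; amplitudes and widths (a) are fitted."""
    def model(x, amp1, w1, amp2, w2):
        return (amp1 * w1**2 / ((x - centres[0]) ** 2 + w1**2)
                + amp2 * w2**2 / ((x - centres[1]) ** 2 + w2**2))
    return model


def fit(centres, x, y, sigma, p0):
    """Fit a for a given choice of the fixed centres b."""
    return curve_fit(make_model(centres), x, y, p0=p0,
                     sigma=sigma, absolute_sigma=True)


# Synthetic spectrum (illustrative values only).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 20.0, 201)
sigma = np.full_like(x, 0.02)
b0 = np.array([8.0, 11.0])                 # theoretical positions
cov_b = np.diag([0.05**2, 0.05**2])        # assumed covariance of those positions
y = make_model(b0)(x, 1.0, 1.0, 0.5, 1.2) + rng.normal(0.0, sigma)

# Statistical covariance of a with b held at its nominal value.
a_hat, cov_stat = fit(b0, x, y, sigma, p0=[1.0, 1.0, 0.5, 1.2])

# Sensitivity matrix J[i, j] = d a_hat_i / d b_j by central finite differences.
step = 1e-3
J = np.empty((a_hat.size, b0.size))
for j in range(b0.size):
    db = np.zeros_like(b0)
    db[j] = step
    a_plus, _ = fit(b0 + db, x, y, sigma, p0=a_hat)
    a_minus, _ = fit(b0 - db, x, y, sigma, p0=a_hat)
    J[:, j] = (a_plus - a_minus) / (2.0 * step)

# First-order propagation: statistical part plus the part induced by b.
cov_total = cov_stat + J @ cov_b @ J.T

print("statistical sigma:", np.sqrt(np.diag(cov_stat)))
print("total sigma:      ", np.sqrt(np.diag(cov_total)))
```

The same `cov_total` could then be propagated to the amplitude ratios. If the position uncertainties are large compared with the peak widths, the first-order approximation is likely to break down.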

So far, all measurements I can find completely neglect that there is an uncertainty in the theoretical positions of the fitted lorentzians and how this impacts the final uncertainty in amplitude ratios.



Maybe I should draw a diagram, I'm not sure if I'm explaining myself particularly clearly. Thanks in advance for your help.
 
You didn't define "uncertainty" yet, but from your details, it looks like you are referring to quantum mechanics.

From the 2nd Edition of Shankar's book, page 128:

The Uncertainty

In any situation described probabilistically, another useful quantity to specify besides the mean is the standard deviation, which measures the average fluctuation around the mean. It is defined as

\Delta \Omega = \langle (\Omega - \langle \Omega \rangle)^2 \rangle^{\frac{1}{2}}

and often called the root-mean-squared deviation. In quantum mechanics, it is referred to as the uncertainty in \Omega.

I myself don't know quantum mechanics, so I don't understand whether your question can be answered by "ordinary" probability theory or whether it involves eigenstates of operators and other quantum mechanical concepts.

Attempting to interpret your problem in ordinary probability theory, it sounds like you have data from a random variable that is a "mixture" (i.e. a linear combination) of several probability distributions. You estimate the unknown coefficients in this linear combination by an algorithm that maximizes some measure of fit. Viewing the estimates as random variables, they have certain standard deviations. The distributions involved in the mixture also have standard deviations. You want some sort of combined standard deviation, or deviations, that combines all the standard deviations involved. Is that roughly correct?
 