Confidence Interval for Coefficient of Quadratic Fit

Office_Shredder
I have a bunch of noisy data points (x, y), and I want to model the data as y = ax^2 + bx + c + noise, where the noise can probably be assumed to be Gaussian, or perhaps uniformly distributed. My data lie firmly inside an interval, and I'm only interested in modeling correctly inside this interval.

My experience is that fitting such a curve can result in significantly different values of a, b, and c with only a very small change in the actual curve. For example:

http://www.wolframalpha.com/input/?i=plot+y+=++x^2+++50,+y+=+1.3+x^2+-+10x+++100+on+[10,30]
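
To put rough numbers on the two curves in that link, here is a minimal sketch that compares them on [10, 30]. The function names and the grid spacing are arbitrary choices for illustration. The leading coefficients differ by 30% (1 versus 1.3), and the printed gaps show how much smaller the difference between the curves themselves is.

```python
def f1(x):
    """First curve from the link: y = x^2 + 50."""
    return x**2 + 50


def f2(x):
    """Second curve from the link: y = 1.3 x^2 - 10 x + 100."""
    return 1.3 * x**2 - 10 * x + 100


# Evenly spaced grid on the interval of interest, [10, 30].
grid = [10 + 0.1 * i for i in range(201)]

# Largest absolute and relative gap between the two curves on the grid.
max_abs_gap = max(abs(f1(x) - f2(x)) for x in grid)
max_rel_gap = max(abs(f1(x) - f2(x)) / f1(x) for x in grid)

print(f"max |f1 - f2| on [10, 30]: {max_abs_gap:.2f}")
print(f"max relative gap:          {max_rel_gap:.1%}")
```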

The things of significant interest to me are the value of f(x), which I believe is being modeled quite well by what I am doing (just picking f(x) to be the quadratic fit minimizing the sum of squared errors), and the value of a itself, which is probably not being modeled very well because of issues like the above. Does anybody know of, or have thoughts on, statistical testing I can do to determine a confidence interval for a, given the noisy data?
 
Office_Shredder said:
The things of significant interest to me are the value of f(x)

The title of the post suggests that you want a confidence interval for a.

I've offered an opinion in several threads on the forum about how software packages compute confidence intervals for the parameters of curves in curve fits. Nobody has confirmed or contradicted me yet! The gist of it is that these confidence intervals are "asymptotic linearized confidence intervals". This amounts to writing a linear approximation for the formula that the curve fit algorithm uses to find the parameter in terms of the data. The parameter is regarded as a random variable that is a linear function of the data. Assume the data are independent random variables and you can estimate the variance of the parameter in terms of the variances of the data.
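
A minimal sketch of that calculation for the quadratic case, under stated assumptions. A quadratic is linear in a, b, and c, so the least-squares estimate is already an exact linear function of the y values and no linearization step is needed. If the noise is independent with equal variance, the covariance of the estimates is the noise variance times the inverse of X^T X, where X is the design matrix. The function names, the synthetic data, and the use of a normal quantile (a large-sample approximation; a Student-t quantile with n - 3 degrees of freedom would be wider for small samples) are choices made for this illustration, not what any particular package does.

```python
import math
import random
from statistics import NormalDist


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with row pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def leading_coefficient_ci(xs, ys, level=0.95):
    """Return (a_hat, se_a, lo, hi) for the x^2 coefficient of a least-squares quadratic."""
    mean_x = sum(xs) / len(xs)
    # Centering x leaves a unchanged (only b and c shift) and improves conditioning.
    rows = [((x - mean_x) ** 2, x - mean_x, 1.0) for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(3)) for i in range(3)]
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    s2 = rss / (len(xs) - 3)                   # unbiased estimate of the noise variance
    se_a = math.sqrt(s2 * inv[0][0])           # Var(beta_hat) = s2 * (X^T X)^{-1}
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal quantile (large-sample approximation)
    return beta[0], se_a, beta[0] - z * se_a, beta[0] + z * se_a


# Synthetic data, made up for illustration: y = x^2 + 50 + Gaussian noise on [10, 30].
rng = random.Random(0)
xs = [10 + 20 * i / 199 for i in range(200)]
ys = [x * x + 50 + rng.gauss(0, 10) for x in xs]

a_hat, se_a, lo, hi = leading_coefficient_ci(xs, ys)
print(f"a = {a_hat:.4f}, se = {se_a:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
```

The interval is only as good as the noise assumptions. With unequal variances or correlated errors, the s2 * (X^T X)^{-1} formula would understate or overstate the uncertainty in a.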

The problem with finding a more respectable way to give a confidence interval for a parameter is that if you regard the parameter as a random variable then, after doing a curve fit, you have a sample for it consisting of 1 single value, namely the value produced by the curve fit algorithm. So how can you estimate the variance of the parameter from a sample of size 1?

If you use a Bayesian approach with prior distributions for the parameters, I think you could estimate a confidence interval by a Monte-Carlo simulation. If you assume the parameters that you got from the curve fit are a good approximation for the "real" parameters and that the means and variances of the data are a good approximation for the means and variances of the phenomena being measured, you could do a Monte-Carlo approximation. It would involve generating data with random errors and fitting a curve to it. Do this many times and you get many simulated samples of the parameters.
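
The second idea in that paragraph (treat the fitted curve as the truth, simulate new noisy data, and refit) can be sketched as below. This is a parametric bootstrap rather than a Bayesian calculation with priors. It assumes Gaussian noise whose standard deviation is estimated from the residuals, and the function names, the number of simulations, and the synthetic data are all hypothetical choices for illustration.

```python
import random


def fit_quadratic(us, ys):
    """Least-squares (a, b, c) for y = a*u^2 + b*u + c via the normal equations."""
    rows = [(u * u, u, 1.0) for u in us]
    # Augmented 3x4 system [X^T X | X^T y], reduced by Gauss-Jordan elimination.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def monte_carlo_interval_for_a(xs, ys, n_sims=500, level=0.95, seed=0):
    """Return (a_hat, lo, hi) using simulated refits around the fitted curve."""
    rng = random.Random(seed)
    mean_x = sum(xs) / len(xs)
    us = [x - mean_x for x in xs]  # centering leaves a unchanged
    a, b, c = fit_quadratic(us, ys)
    fitted = [a * u * u + b * u + c for u in us]
    resid = [y - f for y, f in zip(ys, fitted)]
    sigma = (sum(e * e for e in resid) / (len(xs) - 3)) ** 0.5
    # Each simulation: fitted curve plus fresh Gaussian noise, then refit and keep a.
    sims = sorted(
        fit_quadratic(us, [f + rng.gauss(0, sigma) for f in fitted])[0]
        for _ in range(n_sims)
    )
    k = int((1 - level) / 2 * n_sims)  # number of simulated values cut from each tail
    return a, sims[k], sims[n_sims - 1 - k]


# Synthetic data, made up for illustration: y = x^2 + 50 + Gaussian noise on [10, 30].
data_rng = random.Random(1)
xs = [10 + 20 * i / 199 for i in range(200)]
ys = [x * x + 50 + data_rng.gauss(0, 10) for x in xs]

a_hat, lo, hi = monte_carlo_interval_for_a(xs, ys)
print(f"a = {a_hat:.4f}, simulated 95% interval = [{lo:.4f}, {hi:.4f}]")
```

Swapping rng.gauss for a uniform draw, or resampling the residuals instead of drawing new noise, would adapt the same loop to other noise assumptions.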
 
Stephen Tashi said:
The title of the post suggests that you want a confidence interval for a.

Hence everything that came after that part!

and the value of a itself, which is probably not being modeled very well because of issues like the above. Does anybody know of, or have thoughts on, statistical testing I can do to determine a confidence interval for a, given the noisy data?

I've offered an opinion in several threads on the forum about how software packages compute confidence intervals for the parameters of curves in curve fits. Nobody has confirmed or contradicted me yet! The gist of it is that these confidence intervals are "asymptotic linearized confidence intervals". This amounts to writing a linear approximation for the formula that the curve fit algorithm uses to find the parameter in terms of the data. The parameter is regarded as a random variable that is a linear function of the data. Assume the data are independent random variables and you can estimate the variance of the parameter in terms of the variances of the data.

This sounds pretty good to me. I googled around and found a couple of packages that purport to do something similar to what you are describing. I'll give them a shot and report back.

Reporting back: Matlab has a function called "fit" that takes data and a model to fit, and you can run its output through "confint" to get confidence intervals at any confidence level you want. This is probably good enough; if I need something different, I can take your description of how these calculations are done and do it myself by hand, tweaking as desired. Thanks!
 