# Confidence Interval for Coefficient of Quadratic Fit

1. Jan 13, 2014

### Office_Shredder

Staff Emeritus
I have a bunch of noisy data points (x, y), and I want to model the data as y = ax^2 + bx + c + noise, where the noise can probably be assumed to be Gaussian, or perhaps uniformly distributed. My data lies firmly inside an interval, and I'm only interested in modeling it correctly inside this interval.

My experience is that fitting such a curve can give significantly different values of a, b, and c with only a very small change in the actual curve. For example:

http://www.wolframalpha.com/input/?i=plot+y+=++x^2+++50,+y+=+1.3+x^2+-+10x+++100+on+[10,30]
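To put a number on how close those two curves are, here is a small sketch that evaluates both quadratics from the link on [10, 30]. The grid of 201 points and the variable names are my own choices for illustration. The coefficient a differs by 30% between the two, while the curves themselves should never be more than about 33 apart on an interval where y spans 800.

```python
import numpy as np

# The two quadratics from the link above, evaluated on the interval [10, 30].
x = np.linspace(10.0, 30.0, 201)
f1 = 1.0 * x**2 + 50.0
f2 = 1.3 * x**2 - 10.0 * x + 100.0

max_gap = np.max(np.abs(f1 - f2))          # largest vertical distance between the curves
rel_gap = max_gap / (f1.max() - f1.min())  # gap relative to the range that f1 spans

print(f"largest gap on [10, 30]: {max_gap:.1f}")
print(f"relative to the range of f1: {rel_gap:.1%}")
```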

Two things matter most to me. The first is the value of f(x), which I believe is modeled quite well by what I am doing (taking f(x) to be the quadratic fit that minimizes the sum of squared errors). The second is the value of a itself, which is probably not modeled very well because of issues like the one above. Does anybody know of, or have thoughts on, statistical testing I can do to determine a confidence interval for a, given the noisy data?

2. Jan 13, 2014

### Stephen Tashi

The title of the post suggests that you want a confidence interval for $a$.

I've offered an opinion in several threads on the forum about how software packages compute confidence intervals for the parameters of curves in curve fits. Nobody has confirmed or contradicted me yet! The gist of it is that these confidence intervals are "asymptotic linearized confidence intervals". This amounts to writing a linear approximation for the formula that the curve fit algorithm uses to find the parameter in terms of the data. The parameter is regarded as a random variable that is a linear function of the data. Assume the data are independent random variables and you can estimate the variance of the parameter in terms of the variances of the data.
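For a quadratic the fitted parameters are already an exactly linear function of the y data, so no linearization step is needed and the variance calculation can be written down directly. What follows is a minimal sketch of that idea, not a description of what any particular package does. It assumes independent noise with constant variance, uses synthetic data, and uses a normal quantile in place of the Student t quantile that the usual small-sample formula calls for. The function name `ci_for_a` is hypothetical.

```python
from statistics import NormalDist

import numpy as np


def ci_for_a(x, y, level=0.95):
    """Return (a_hat, lower, upper) for the x^2 coefficient of a least-squares quadratic."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(x, 3)                        # design matrix, columns: x^2, x, 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(x) - 3                           # n data points minus 3 fitted parameters
    sigma2 = resid @ resid / dof               # estimated noise variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance matrix of (a, b, c)
    se_a = np.sqrt(cov[0, 0])                  # standard error of a
    z = NormalDist().inv_cdf(0.5 + level / 2)  # assumption: normal approximation to the t quantile
    return beta[0], beta[0] - z * se_a, beta[0] + z * se_a


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
x = np.linspace(10.0, 30.0, 60)
y = 1.0 * x**2 + 50.0 + rng.normal(0.0, 5.0, size=x.size)

a_hat, lo, hi = ci_for_a(x, y)
print(f"a = {a_hat:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

The off-diagonal entries of `cov` are worth a look as well: a strong correlation between the estimates of a and b is one way the "very different coefficients, nearly the same curve" effect shows up.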

The problem with finding a more respectable way to give a confidence interval for a parameter is that if you regard the parameter as a random variable then, after doing a curve fit, you have a sample for it consisting of 1 single value, namely the value produced by the curve fit algorithm. So how can you estimate the variance of the parameter from a sample of size 1?

If you use a Bayesian approach with prior distributions for the parameters, I think you could estimate a confidence interval by a Monte-Carlo simulation. If you assume the parameters that you got from the curve fit are a good approximation for the "real" parameters and that the means and variances of the data are a good approximation for the means and variances of the phenomena being measured, you could do a Monte-Carlo approximation. It would involve generating data with random errors and fitting a curve to it. Do this many times and you get many simulated samples of the parameters.
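A rough sketch of the simulation described above, under stated assumptions: the fitted curve is treated as the truth, the noise is taken to be Gaussian with a constant standard deviation estimated from the residuals, and the "observed" data here is synthetic. This is the plug-in simulation from the second half of the paragraph, not the Bayesian version with priors, and the number of simulations (2000) is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "observed" data, purely for illustration.
x = np.linspace(10.0, 30.0, 60)
y = 1.0 * x**2 + 50.0 + rng.normal(0.0, 5.0, size=x.size)

# Step 1: fit the observed data and estimate the noise level from the residuals.
X = np.vander(x, 3)                              # columns: x^2, x, 1
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma_hat = np.sqrt(resid @ resid / (x.size - 3))

# Step 2: simulate many data sets from the fitted curve plus Gaussian noise.
n_sim = 2000
y_sim = (X @ beta_hat)[:, None] + rng.normal(0.0, sigma_hat, size=(x.size, n_sim))

# Step 3: refit every simulated data set (one column per data set).
beta_sim, *_ = np.linalg.lstsq(X, y_sim, rcond=None)
a_sim = beta_sim[0]                              # simulated values of a

# Step 4: read the interval off the simulated distribution of a.
lo, hi = np.percentile(a_sim, [2.5, 97.5])
print(f"a = {beta_hat[0]:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

If uniform noise seems more plausible, the `rng.normal` call in step 2 could be swapped for `rng.uniform` with a matching spread. Resampling the residuals themselves is another option that avoids choosing a noise distribution at all.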

3. Jan 13, 2014

### Office_Shredder

Staff Emeritus
Hence everything that came after that part!

This sounds pretty good to me. I googled around and found a couple packages that purport to do something similar to what you are describing. I'll give them a shot and report back.

Reporting back: MATLAB has a function called "fit" that takes data and a model to fit, and you can run its output through "confint" to get confidence intervals at any confidence level you like. This is probably good enough. If I need something different, I can take your description of how these calculations are done and do it myself by hand, tweaking as desired. Thanks!

Last edited: Jan 13, 2014