How to Quantify Statistical Significance of Model Deviation from Data Points?

In summary: if the resulting p-value is smaller than some chosen threshold, the parameter a1 is deemed statistically different from the fiducial value a0.
  • #1
Amanheis
I need to quantify the statistical significance of how much a model deviates from a given set of data points, and I cannot do a fit.

Let's say the model is a one-parameter description of some time-dependent quantity [tex]f_a(t)[/tex]. I have data points at n different times, including error bars, so [tex]p_i = \{t_i, f_{a_0}(t_i), \sigma_i\}[/tex]. The reason I cannot do a fit is that the data are actually the predicted values and errors for some fiducial value [tex]a_0[/tex], and a fit would obviously just find [tex]a = a_0[/tex].

I want to know how far off a couple of other models, with [tex]a = a_1[/tex] or [tex]a = a_2[/tex], are. In other words, how well does [tex]f_{a_1}(t)[/tex] fit the data points [tex]p_i[/tex]? Can I rule it out at some confidence level?


This problem is difficult to express; I hope it is clear enough. All I am asking for is a hint in the right direction, so if you know a relevant reference, I can just get it from the library. I have already calculated the likelihood L(a) but don't really know how to proceed.

Thanks.
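[For reference, one standard way to quantify such a deviation (a sketch assuming independent Gaussian errors, not necessarily what the posters settled on) is the chi-square distance between the trial curve and the fiducial curve, in units of the forecast errors:

[tex]\Delta\chi^2 = \sum_{i=1}^{n} \frac{\left(f_{a_1}(t_i) - f_{a_0}(t_i)\right)^2}{\sigma_i^2}[/tex]

For a single parameter, [tex]\Delta\chi^2 \approx 3.84[/tex] corresponds to ruling out [tex]a_1[/tex] at roughly the 95% confidence level.]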
 
  • #2
If you have the predicted (projected) value and the error term for each t, why can't you calculate ("back out") the actual observed value corresponding to that t?
 
  • #3
I don't understand. How am I supposed to do that? The actual observed value won't be available until several years from now. And "observed value" implies that it is observed; I don't see how I can calculate it.
 
  • #4
So, the error bars you mentioned are forecast (future) errors? That wasn't clear to me.

Can you post your understanding of what these errors are, or what they represent?
 
  • #5
Yes, the error bars are predicted. I am basically using the specifications of the experiment and the Fisher matrix formalism to assess the quality of the constraints that will be put on fiducial values of a set of parameters.
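[A minimal sketch of such a one-parameter Fisher forecast, assuming independent Gaussian errors; the model f and all numbers are hypothetical placeholders, not the poster's actual setup:

```python
import numpy as np

# Hypothetical one-parameter model standing in for f_a(t).
def f(a, t):
    return a * np.log(t)

a0 = 1.0                                 # fiducial parameter value
t = np.array([1500.0, 1700.0, 1900.0])   # observation times
sigma = np.array([0.10, 0.15, 0.20])     # forecast 1-sigma errors per point

# One-parameter Fisher "matrix" with independent Gaussian errors:
# F = sum_i (df/da)(t_i)^2 / sigma_i^2, derivative taken numerically at a0.
eps = 1e-6
dfda = (f(a0 + eps, t) - f(a0 - eps, t)) / (2 * eps)
F = np.sum(dfda**2 / sigma**2)

# Forecast 1-sigma constraint on a from the planned experiment.
print(f"forecast sigma(a) = {1.0 / np.sqrt(F):.4f}")
```
]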
 
  • #6
Let me make up an example: you are predicting the value of a random variable Y for, say, February 2012. You have a point estimate, [tex]\hat Y(\text{Feb. 2012})[/tex], and an error "bar" "wrapped around" the point estimate. Have I got it?
 
  • #7
EnumaElish said:
Let me make up an example: you are predicting the value of a random variable Y for, say, February 2012. You have a point estimate, [tex]\hat Y(\text{Feb. 2012})[/tex], and an error "bar" "wrapped around" the point estimate. Have I got it?

Not quite. We already kind of know the value of Y(Feb 2012). For the sake of clarity, let's choose a date in the past, say the average temperature at some place in the year 1500. I will also adopt the notation from my original post.
We already kind of know how the temperature behaves over time, depending on a set of parameters, one of them being [tex]a = a_0[/tex]. According to earlier measurements, we expect something like f(1500) = 20 °C, f(1700) = 22 °C, and f(1900) = 25 °C.
But we want to constrain it as much as possible and plan an improved experiment. According to my calculations, we expect error bars of 0.1 °C, 0.15 °C, and 0.2 °C on these values after the new experiment. Now we want to see whether we will be able to test a competing theory (i.e., one in which the parameter a has a different value). I want to quantify how much the curve of the new theory with [tex]a = a_1[/tex] would deviate from the data with the predicted error bars.
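[A rough numerical sketch of that comparison; only the fiducial temperatures and error bars come from the post above, while the offsets of the competing curve are made up for illustration:

```python
import numpy as np
from scipy.stats import chi2

# Fiducial predictions and forecast error bars from the example above;
# the offsets of the competing a = a1 curve are hypothetical.
f_a0 = np.array([20.0, 22.0, 25.0])          # expected temperatures, deg C
f_a1 = f_a0 + np.array([0.15, 0.20, 0.30])   # hypothetical rival-theory curve
sigma = np.array([0.10, 0.15, 0.20])         # forecast error bars, deg C

# Chi-square distance between the two curves in units of the forecast errors.
dchi2 = np.sum((f_a1 - f_a0)**2 / sigma**2)

# p-value if the future data land on the fiducial curve; one degree of
# freedom is used here because a single parameter separates the two models
# (a modeling choice, not a universal rule).
p = chi2.sf(dchi2, df=1)
print(f"Delta chi^2 = {dchi2:.2f}, p = {p:.4f}")
```
]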
 
  • #8
I can describe how to tell whether the projections based on a0 are statistically significantly different from a projection based on a1. The error bars imply a joint distribution of temperatures across years. You can resample data points using this joint distribution and re-estimate f as a function of t based on each (re)sample. This will yield a set of values (an empirical distribution) for the a parameter, with mean = a0. You can then test whether a1 is statistically different from the mean, using the empirical distribution of a.
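[A sketch of that resampling procedure in Python; the model, the competing value a1, and the use of independent Gaussian errors in place of a general joint distribution are all illustrative assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Hypothetical one-parameter model standing in for f_a(t).
def f(t, a):
    return a * np.log(t)

a0 = 1.0                                 # fiducial parameter value
t = np.array([1500.0, 1700.0, 1900.0])   # observation times
sigma = np.array([0.10, 0.15, 0.20])     # forecast 1-sigma errors
y0 = f(t, a0)                            # fiducial curve plays the role of the data

# Resample pseudo-data from the implied error distribution (independent
# Gaussians here, for simplicity) and refit a on each sample.
a_samples = []
for _ in range(2000):
    y = y0 + rng.normal(0.0, sigma)
    a_hat, _ = curve_fit(f, t, y, p0=[a0], sigma=sigma, absolute_sigma=True)
    a_samples.append(a_hat[0])
a_samples = np.array(a_samples)

# Empirical two-sided test: how often does a resampled estimate deviate
# from a0 by at least as much as the competing value a1 does?
a1 = 1.02
p = np.mean(np.abs(a_samples - a0) >= abs(a1 - a0))
print(f"mean a = {a_samples.mean():.4f}, empirical p-value for a1 = {p:.4f}")
```
]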
 

1. How do you compare a theory to data?

To compare a theory to data, a scientist typically collects data through experiments or observations and then analyzes the data to see if it aligns with the predictions or expectations of the theory. This can involve statistical methods, visual representations, and other techniques to determine the level of agreement between the theory and the data.
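As a small illustration of the statistical side (all numbers here are hypothetical), a common agreement check is a chi-square goodness-of-fit test on the residuals:

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical observations, theory predictions, and measurement errors.
observed = np.array([4.9, 10.1, 15.3, 19.6])
predicted = np.array([5.0, 10.0, 15.0, 20.0])
errors = np.array([0.2, 0.2, 0.3, 0.3])

# Goodness-of-fit statistic: squared residuals in units of the errors.
stat = np.sum(((observed - predicted) / errors) ** 2)
dof = len(observed)            # no parameters were fitted to these data
p_value = chi2.sf(stat, dof)   # chance of a deviation at least this large
print(f"chi^2 = {stat:.2f} on {dof} dof, p = {p_value:.3f}")
```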

2. What are some challenges in comparing theory to data?

Some challenges in comparing theory to data include ensuring the validity and reliability of the data, accounting for potential biases or confounding variables, and addressing discrepancies between the theory and the data. It can also be difficult to determine the appropriate level of significance or confidence in the results of the comparison.

3. How important is it to compare theory to data?

Comparing theory to data is a crucial aspect of the scientific process as it allows scientists to test the validity and applicability of theories. It also helps to refine and improve existing theories, as well as develop new theories to explain and predict phenomena.

4. What is the role of replication in comparing theory to data?

Replication is an important aspect of comparing theory to data as it allows for the confirmation or refutation of findings. When the results of a study are replicated by independent researchers, it increases the confidence in the validity and reliability of the findings, and strengthens the support for the theory being tested.

5. Can a theory ever be proven by data?

No, a theory can never be proven by data. Instead, data can support or refute a theory, but it can never provide absolute proof. This is because new data or evidence may emerge in the future that could challenge or contradict the theory. Therefore, theories are constantly evolving and subject to change based on new data and evidence.
