# Comparing discrete data to a continuous model (1D)

1. Apr 4, 2013

### mikeph

Say I have a model, y = f(x), and ten discrete data points to compare to this model, (x1, y1), ..., (x10, y10). The usual approach would then be to take the residuals and square them to get a measure of the quality of fit, i.e.

average residuals squared = {[f(x1) - y1]^2 + ... + [f(x10) - y10]^2}/10

I also remember being told that if this value is minimised, then the model f(x) is the best estimate of the data, assuming the data contains only Gaussian noise. Is that right?
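To make the discrete case concrete, here is a minimal sketch of the average-squared-residual computation above. The model f(x) = 2x and the ten data points are made up purely for illustration:

```python
def mean_squared_residual(f, xs, ys):
    """Average of [f(x_i) - y_i]^2 over the data points."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: roughly y = 2x with small deviations added by hand.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2, 17.9, 20.1]

# Candidate model y = f(x) = 2x
msr = mean_squared_residual(lambda x: 2 * x, xs, ys)
print(msr)  # small, since the model tracks the data closely
```

A "best" fit in the least-squares sense is then the f (within some family of candidate models) that makes this number as small as possible.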

Say instead my data were continuous (for whatever reason). Is it an equally rigorous idea to minimise the continuous analogue of the summed squared residuals? For example, if my data is y = g(x), then the continuous version of the residual is

average residual squared = integral of (f(x) - g(x))^2 dx.

Does this make sense, is this the correct approach to comparing a continuous data set and a model?
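The continuous version can be sketched numerically by approximating the integral with a Riemann sum. The particular f, g, and interval [0, 1] below are hypothetical choices, just to make the computation concrete:

```python
def integrated_squared_residual(f, g, a, b, n=10000):
    """Midpoint-rule approximation of the integral of (f(x) - g(x))^2 over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += (f(x) - g(x)) ** 2 * dx
    return total

# Example: model f(x) = x against "continuous data" g(x) = x + 0.1.
# The integrand is the constant 0.01, so over [0, 1] the integral is 0.01.
print(integrated_squared_residual(lambda x: x, lambda x: x + 0.1, 0.0, 1.0))
```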

Thanks

edit - I can maybe put this a better way. Rather than comparing the data to f(x) only at the points where we have measured data, which seems a bit biased to me, why don't we compare them over the entire range of x? We could say "the most we can obtain from our data is that the function looks like a stepwise function with step heights equal to y1, y2, ...", and then compute the residual in terms of the area between the model and the stepwise function.
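The stepwise idea in the edit can be sketched as follows: treat the data as a step function that holds the value y_i over the i-th interval, and integrate the squared difference between the model and that step function. The interval width (1.0 here) and the placement of the steps are assumptions for illustration:

```python
def step_residual(f, ys, x0=0.0, width=1.0, n_sub=100):
    """Integral of (f(x) - step(x))^2 dx, where step(x) = ys[i]
    on the interval [x0 + i*width, x0 + (i+1)*width)."""
    total = 0.0
    dx = width / n_sub
    for i, y in enumerate(ys):
        left = x0 + i * width
        for j in range(n_sub):  # midpoint rule within each step
            x = left + (j + 0.5) * dx
            total += (f(x) - y) ** 2 * dx
    return total

# Example: a constant model f(x) = 1 against step heights all equal to 1
# gives zero area between the two curves.
print(step_residual(lambda x: 1.0, [1.0, 1.0, 1.0]))
```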

Last edited: Apr 4, 2013
2. Apr 5, 2013

### Stephen Tashi

Asking for the "best" way to fit a model to data is like asking for the best color to paint a room. It isn't a mathematical question unless you precisely define what "best" means to you.

If you precisely define the meaning of "best", then you need a lot of information (or a lot of assumptions) to solve the problem. Otherwise, finding the best way is as futile as trying to find the missing sides and angles of a triangle when all you know is one side and one angle.

People often define the "best" fit to mean a model that minimizes the sum of the squares of the "errors" or "residuals" between the fitting equation and the data. In the continuous case, some people define "best" to mean a fit that minimizes the integral of the square of the difference between the fit and a continuous version of the data.

In the case where the data is assumed to come from a probability distribution, people sometimes define the "best" fit to be the one that minimizes the sum of the squared residuals between the fitted cumulative distribution and the cumulative distribution of the data. This is the method you proposed in your edit.
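That CDF-based criterion can be sketched in a few lines: compare a candidate model CDF against the empirical CDF of the sample at the sorted data points, and sum the squared differences. The Uniform(0, 1) model CDF below is a hypothetical choice for illustration:

```python
def cdf_squared_residuals(sample, model_cdf):
    """Sum of [F_model(x) - F_empirical(x)]^2 evaluated at the sorted data,
    where F_empirical at the i-th sorted point is (i + 1) / n."""
    xs = sorted(sample)
    n = len(xs)
    return sum((model_cdf(x) - (i + 1) / n) ** 2 for i, x in enumerate(xs))

# Example: a sample whose empirical CDF exactly matches the
# Uniform(0, 1) CDF F(x) = x, giving zero residual.
sample = [0.25, 0.5, 0.75, 1.0]
print(cdf_squared_residuals(sample, lambda x: x))
```

Minimizing this quantity over a family of candidate CDFs is one way to pick the distribution that "best" fits the sample, in the same least-squares spirit as the discrete case.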

The above facts are facts about human behavior and culture, not mathematical theorems. People have written mathematical articles about why least squares turns out to be a good way of defining "best" in real world problems. These articles argue that particular goals and particular assumptions are reasonable models for many real world problems and they show that least squares fitting is best according to those goals and assumptions.