Which error should I calculate?

  • Context: Undergrad
  • Thread starter: hermano
  • Tags: Error

Discussion Overview

The discussion revolves around the methods for calculating the 'total error' when comparing different models that predict values based on measured data. Participants explore various statistical techniques for quantifying the accuracy of these models, including the sum of squared errors and the sum of absolute errors, while addressing the complexities involved in the modeling process.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant seeks advice on how to calculate a 'total error' to compare the accuracy of different models based on predicted and measured values.
  • Another participant suggests that the choice of error calculation method depends on the specific aspect of error the user is interested in.
  • A participant questions the clarity of the original request, emphasizing the need for precise definitions of what is meant by 'total accuracy' and the importance of the context in which the error is measured.
  • There is a discussion about the implications of using different statistical approaches, such as Least Squares versus Maximum-Likelihood estimation, and how these relate to the user's goals.
  • One participant highlights the importance of understanding potential imprecision in the data and the nature of the measurements involved, suggesting that the model fitting process may need to account for errors in both the dependent and independent variables.
  • A later reply raises concerns about the methodology of combining data from multiple sensors and the implications of this process on the error calculations.

Areas of Agreement / Disagreement

Participants express differing views on the appropriate methods for calculating total error, and there is no consensus on a single approach. The discussion remains unresolved regarding the best technique to use for the user's specific context.

Contextual Notes

Participants note that the problem may involve complexities not fully articulated, such as the nature of the sensor measurements and the assumptions made during data processing. There are also questions about the distribution of data points across the range of x-values and how this might affect the error calculations.

hermano
Hi,

I want to compare (statistically) different models which predict the values y at several x-values. Therefore I want to calculate the 'total error' between the exact (measured) y-values and the calculated y-values using different models. My problem is that I'm not sure which method to use to calculate the 'total error' for each model. Should I use the sum of squared errors, the sum of the absolute errors or some other technique?

Thanks
 
Bro:
It will depend on what aspect of the error you are interested in.
 
What do you mean with 'the aspect'? I want to quantify the accuracy of my models by calculating the difference between the predicted values of the model and the measured values for the whole set of points (thus for each x-value). I can easily calculate the error for each x-value, but I want to add all these errors together in some way, like the sum of absolute errors, to get a total error for the whole set: a single number that quantifies the total accuracy of my model. The question is: Which method should I use to add all these 'separate' errors together?
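To make the candidates in this question concrete, here is a minimal sketch of the usual ways to collapse a vector of pointwise errors into one number. The function name `error_summaries` and the sample data are illustrative assumptions, not something from the thread.

```python
import math


def error_summaries(measured, predicted):
    """Return several common single-number summaries of the pointwise errors."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    residuals = [p - m for m, p in zip(measured, predicted)]
    n = len(residuals)
    sse = sum(r * r for r in residuals)   # sum of squared errors
    sae = sum(abs(r) for r in residuals)  # sum of absolute errors
    return {
        "sse": sse,
        "sae": sae,
        "rmse": math.sqrt(sse / n),       # root mean squared error
        "mae": sae / n,                   # mean absolute error
        "max_abs": max(abs(r) for r in residuals),  # worst single point
    }


# Made-up data: model_a is off by 1 everywhere, model_b is exact except
# for one point that is off by 4.
measured = [1.0, 2.0, 3.0, 4.0]
model_a = [2.0, 3.0, 4.0, 5.0]
model_b = [1.0, 2.0, 3.0, 8.0]

summary_a = error_summaries(measured, model_a)
summary_b = error_summaries(measured, model_b)
```

With this made-up data the two models tie on the sum of absolute errors but differ on the sum of squared errors, because squaring penalizes the single large miss more heavily. Which of these behaviors is preferable depends on what the number is meant to express.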
 
Well, modeling of processes can be done either from the perspective of Least Squares or from that of Maximum-Likelihood estimation. I just wondered which perspective you are using, to get some insight.
 
hermano said:
What do you mean with 'the aspect'?
The question is what YOU mean by aspect.

I want to quantify the accuracy of my models by calculating the difference between the predicted values of the model and the measured values for the whole set of points (thus for each x-value). I can easily calculate the error for each x-value, but I want to add all these errors together in some way, like the sum of absolute errors, to get a total error for the whole set: a single number that quantifies the total accuracy of my model. The question is: Which method should I use to add all these 'separate' errors together?

This expresses an intuitive desire, but it is not a well-posed mathematical problem. For example, suppose you have a model F(x) for x values in the range 0 to 100. Is it more or less important to fit the values of x from 0 to 50 than the values from 90 to 100? Do you care about errors as measured by the arithmetic difference between measured and predicted values, or do you care about the percentage error? Is an over-prediction by 10 as bad as an under-prediction by 10? Is the data that you have equally spaced over all the x values, or do you have a lot of data for one particular subset of those values?

Most importantly, what are you trying to accomplish? Are you looking for a number that "quantifies the total accuracy of your model" to publish in a paper, or in an advertising flyer? Are you trying to do a statistical hypothesis test that accepts or rejects the model?
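Each of the questions above corresponds to a different way of aggregating the errors. The following sketch shows one possible form for each choice; the function names, the penalty parameters, and the numbers are hypothetical and only illustrate the idea.

```python
def weighted_sse(measured, predicted, weights):
    """Squared errors, with some x-regions counted more heavily than others."""
    return sum(w * (p - m) ** 2 for m, p, w in zip(measured, predicted, weights))


def mean_abs_pct_error(measured, predicted):
    """Mean of |error| / |measured|; undefined where a measured value is zero."""
    if any(m == 0 for m in measured):
        raise ValueError("relative error is undefined for measured values of 0")
    return sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted)) / len(measured)


def asymmetric_abs_error(measured, predicted, over_penalty=1.0, under_penalty=1.0):
    """Absolute errors, with over- and under-prediction penalized differently."""
    total = 0.0
    for m, p in zip(measured, predicted):
        diff = p - m
        total += over_penalty * diff if diff > 0 else under_penalty * (-diff)
    return total


measured = [10.0, 20.0]
predicted = [11.0, 18.0]  # over by 1 at the first point, under by 2 at the second

equal_weights = weighted_sse(measured, predicted, [1.0, 1.0])
uneven_weights = weighted_sse(measured, predicted, [2.0, 0.5])
relative = mean_abs_pct_error(measured, predicted)
lopsided = asymmetric_abs_error(measured, predicted, over_penalty=1.0, under_penalty=3.0)
```

Note that a percentage-style error breaks down for a signal that passes through zero, which is one reason the choice cannot be made without knowing the data and the goal.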
 
You put it much more nicely and precisely than I did, Stephen. Many people seem not to realize the need for specific details of what they want when they make a request. Nice job!
 
Hi Stephen and Bacle,

Indeed, I want a number (which reflects the total error) that quantifies the total accuracy of my model so I can compare different models with each other.

I will try to explain my problem:
Let's say that the data I have measured is a rough sine wave as a function of the angular position (0 to 2*pi, which is the independent variable x), which I measured with three sensors at three different angular positions. The sample frequency determines the number of data points; let's say that for one revolution this is 1000 equidistant points. I add these three measurements together (three vectors of 1000 points), and this sum is the input for my model. With my model I want to separate the data again for each sensor. In order to quantify each model, I want to compute the difference between the measured data of each sensor and the separated data of my model for each sensor. This again gives me three vectors of 1000 points, which are the ABSOLUTE errors at each angular position for the three sensors. My question is: How can I define/calculate one number for each of these vectors that quantifies the total error of my model?

At the end I want to compare these numbers for each model in order to select the model which gives me the lowest error between the measured and calculated data based on the total error!
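As a rough sketch of this comparison, assuming RMSE is the chosen summary (the thread does not settle on one), each model can be scored per sensor and then by a pooled value over all sensors. The names `score_model` and `best_model` and the tiny error vectors are hypothetical; real vectors would hold 1000 points per sensor.

```python
import math


def rmse(errors):
    """Root mean squared value of a vector of pointwise errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def score_model(error_vectors):
    """error_vectors: one list of pointwise errors per sensor.

    Returns the RMSE of each sensor and the RMSE pooled over all sensors.
    """
    per_sensor = [rmse(v) for v in error_vectors]
    pooled = rmse([e for v in error_vectors for e in v])
    return per_sensor, pooled


def best_model(models):
    """models: dict mapping a model name to its per-sensor error vectors.

    Picks the model with the smallest pooled RMSE (an assumed criterion).
    """
    return min(models, key=lambda name: score_model(models[name])[1])


# Tiny made-up error vectors (two points per sensor instead of 1000).
models = {
    "A": [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],  # small error on every sensor
    "B": [[0.0, 0.0], [0.0, 0.0], [3.0, 3.0]],  # perfect on two, poor on one
}
winner = best_model(models)
```

Pooling is itself a choice: in this made-up case model A wins on the pooled value, while model B is better on two of the three sensors, so the per-sensor numbers are worth reporting alongside the single total.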

I hope this makes my problem clearer, so that you can help me with it!

Thanks
 
One of the first things to determine is if there is imprecision in the data. In a simplistic view of the world, the model would be [itex]y = f(x)[/itex] and the data would be perfectly accurate. In a slightly more complicated view, data of the form [itex](x_i, y_i)[/itex] has [itex]x_i[/itex] measured perfectly but the [itex]y_i[/itex] have measurement errors. In an even more complicated view, the [itex]x_i[/itex] are not perfectly accurate either.

For example, models are often fit by defining "best fit" to mean a fit f(x) that minimizes the average of the quantities [itex](y_i - f(x_i))^2[/itex] which doesn't account for any error in the [itex]x_i[/itex]. The different approach of "total least squares" assumes that there are also errors in the [itex]x_i[/itex] measurements. http://en.wikipedia.org/wiki/Total_least_squares
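The difference between the two notions of "best fit" can be seen in the simplest case, a straight line. The sketch below assumes a line model and equal error variance in x and y, in which case total least squares reduces to orthogonal regression with a closed-form slope; the function names and data are illustrative, and this is not the general matrix formulation described in the linked article.

```python
import math


def _centered_sums(xs, ys):
    """Means and centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def ols_line(xs, ys):
    """Ordinary least squares: minimizes vertical distances (errors in y only)."""
    mx, my, sxx, _, sxy = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


def tls_line(xs, ys):
    """Orthogonal regression: minimizes perpendicular distances to the line.

    Assumes equal error variance in x and y and a nonzero cross-product sum.
    """
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxy == 0:
        raise ValueError("slope is not determined when the cross-product sum is 0")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 1.0, 3.0]
ols_slope, ols_intercept = ols_line(xs, ys)
tls_slope, tls_intercept = tls_line(xs, ys)
```

On noisy data the two fits generally give different lines, so the "error" of a model depends on which kind of measurement error is assumed in the first place. On data that lies exactly on a line, both return the same slope and intercept.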

I'm guessing that there are complications in your problem that haven't been explained yet because you speak of "adding" the data from the 3 sensors and then separating it again. If by "adding", you simply mean putting the 3 sets of data into one file, then separating it again seems a trivial operation, so I don't know why you would bother to mention it. It would be best if you explained the actual nature of the sensors and what they measure. Do you process the raw sensor measurements by assuming the sensors are at some known angle relative to the thing they measure when [itex]x_i = 0[/itex]? Is the placing of the 1000 equally spaced angles done by taking measurements equally spaced in time and assuming a constant rate of rotation of something?
 
