Metric for rating function accuracy from R^m to R.

In summary, the poster is writing a program that generates pseudorandom expressions in the hope of closely approximating a given data set. They want a way to rate the accuracy of these functions, preferably a method that takes a list of expected and actual outputs and returns a real number. The replies suggest the sum of squared errors or the sum of absolute errors. The poster chooses the absolute-value version, reasoning that randomly generated expressions will include many bad solutions and that a metric which does not underestimate their badness is preferable.
  • #1
TylerH
I'm writing a program in which I generate pseudorandom expressions in the hope of finding one that closely approximates a given data set. The functions map from R^m (an m-tuple of reals) to a real. What I need is a way to rate the functions by their accuracy. Are there any known methods for doing this? Maybe something from stats.

Ideally this would be a method that would take a list of expected outputs and actual outputs and return a real number. A uniform distribution would be good, but not required.
 
  • #2
A standard thing is to do something like
[tex] \sum_{j} \left( y_j - z_j \right)^2 [/tex]

where the y_j are the actual data that you measured, and the z_j are the predictions that you made (this is called the sum of squared errors). It's used mostly because sums of squares are easy to work with in analytical calculations (like trying to minimize it). If you're doing a numerical method, people will often use the sum of the absolute values of the errors instead, especially if they regard the squared error's main property, "makes sure there are no significant outliers at the cost of being off by slightly more on average", as a drawback.
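As a rough sketch of the two options in code (Python is my choice here, and the function names are made up for illustration, not taken from any library), both take the expected and actual outputs and return a single real number. The `score` helper is an extra assumption on my part: randomly generated expressions can produce NaN or infinite values, and one simple way to handle that is to treat such a candidate as infinitely bad.

```python
import math
from typing import Callable, Sequence


def sum_squared_error(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of (y_j - z_j)^2 over all data points."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same length")
    return sum((y - z) ** 2 for y, z in zip(expected, actual))


def sum_absolute_error(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of |y_j - z_j| over all data points."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same length")
    return sum(abs(y - z) for y, z in zip(expected, actual))


def score(
    expected: Sequence[float],
    actual: Sequence[float],
    metric: Callable[[Sequence[float], Sequence[float]], float] = sum_absolute_error,
) -> float:
    """Hypothetical wrapper: a candidate that outputs NaN or inf gets the worst score."""
    if not all(math.isfinite(z) for z in actual):
        return math.inf
    return metric(expected, actual)
```

Lower is better for both metrics, and a perfect fit scores 0, so candidates can be ranked by sorting on `score`.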
 
  • #3
I went with the abs value one because, by nature of the fact they're all random, there are going to be a lot of bad solutions. So I thought a metric which doesn't underestimate their badness would be better. Thanks.
 
  • #4
TylerH said:
I went with the abs value one because, by nature of the fact they're all random, there are going to be a lot of bad solutions. So I thought a metric which doesn't underestimate their badness would be better. Thanks.

I don't really understand what this post is trying to get at... do you mean that your data has a lot of noise in it?
 
  • #5


There are several methods that can be used to rate the accuracy of a function mapping from R^m to R. One approach is to use a regression analysis, which involves fitting a mathematical model to the data set and then evaluating the model's performance in predicting the actual outputs. This can provide a numerical measure of the function's accuracy, such as the coefficient of determination (R^2).
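For reference, the coefficient of determination can be computed directly from the expected and actual outputs without any fitting step. The following is only a minimal sketch using the usual definition R^2 = 1 - SS_res / SS_tot; the function name `r_squared` is made up for illustration.

```python
from typing import Sequence


def r_squared(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(expected) != len(actual) or not expected:
        raise ValueError("need two non-empty sequences of equal length")
    mean_y = sum(expected) / len(expected)
    # Residual sum of squares: how far the candidate's outputs are from the data.
    ss_res = sum((y - z) ** 2 for y, z in zip(expected, actual))
    # Total sum of squares: how far the data are from their own mean.
    ss_tot = sum((y - mean_y) ** 2 for y in expected)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all expected outputs are equal")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means a perfect fit, 0 means the candidate does no better than always predicting the mean of the data, and negative values mean it does worse than that, which is likely to be common among randomly generated expressions.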

Another method is to use a loss function, which measures the discrepancy between the expected outputs and the actual outputs. This can be used to calculate a mean squared error or root mean squared error, which can provide a measure of the function's accuracy.

Additionally, techniques such as cross-validation or holdout validation can be used to evaluate the performance of the function on a separate set of data, which can provide a more robust measure of accuracy.
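A holdout check might look roughly like the sketch below. It assumes each data point is an m-tuple of inputs plus one output, and that a candidate is a Python callable taking m arguments; `holdout_split` and `mean_absolute_error` are hypothetical helper names, not from a library.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def holdout_split(
    inputs: Sequence[Point],
    outputs: Sequence[float],
    test_fraction: float = 0.25,
    seed: int = 0,
) -> Tuple[Tuple[List[Point], List[float]], Tuple[List[Point], List[float]]]:
    """Shuffle the data and set aside a fraction of it for testing."""
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([inputs[i] for i in train_idx], [outputs[i] for i in train_idx])
    test = ([inputs[i] for i in test_idx], [outputs[i] for i in test_idx])
    return train, test


def mean_absolute_error(
    f: Callable[..., float], xs: Sequence[Point], ys: Sequence[float]
) -> float:
    """Average |y - f(x)| of candidate f over the given points."""
    return sum(abs(y - f(*x)) for x, y in zip(xs, ys)) / len(xs)
```

The idea is to search for expressions using only the training part and then report the error on the held-out part, so that an expression which merely memorizes the training points is not rated too highly.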

It is important to note that the choice of metric for rating function accuracy may depend on the specific goals and requirements of your program. It may be helpful to consult with a statistician or data scientist to determine the most appropriate method for your particular application.
 

1. What is a metric for rating function accuracy from R^m to R?

A common metric used for rating function accuracy from R^m to R is the Mean Squared Error (MSE). This metric measures the average squared difference between the predicted values and the actual values. It is calculated by taking the sum of the squared errors and dividing it by the number of data points.

2. How is the MSE calculated?

The MSE is calculated by taking the squared difference between each predicted value and the corresponding actual value, summing these, and dividing by the number of data points. Mathematically, it can be represented as:
[tex] \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 [/tex]
where y_i is the actual value, ŷ_i is the predicted value for the same data point, and n is the number of data points.

3. What is the significance of using MSE as a metric for function accuracy?

MSE is a popular metric for function accuracy because it not only takes into account the magnitude of the errors, but also penalizes larger errors more heavily. This means that a model with a lower MSE is better at predicting values and has a smaller overall error.

4. Are there any limitations to using MSE as a metric for function accuracy?

One limitation of using MSE is that it can be heavily influenced by outliers in the data. Since the errors are squared, large errors can greatly impact the overall MSE value, even if they are rare occurrences. Additionally, MSE may not be the most appropriate metric for all types of data and models, so it is important to consider other metrics as well.
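A small made-up example illustrates the outlier effect. The data below are invented for illustration: two sets of predictions have the same mean absolute error, but the one whose error is concentrated in a single point has a mean squared error four times larger.

```python
from typing import Sequence


def mse(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((y - z) ** 2 for y, z in zip(expected, actual)) / len(expected)


def mae(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(y - z) for y, z in zip(expected, actual)) / len(expected)


expected = [1.0, 2.0, 3.0, 4.0]
spread = [3.5, 4.5, 5.5, 6.5]   # every prediction is off by 2.5
spiky = [1.0, 2.0, 3.0, 14.0]   # three exact predictions, one off by 10

print(mae(expected, spread), mse(expected, spread))  # 2.5 6.25
print(mae(expected, spiky), mse(expected, spiky))    # 2.5 25.0
```

Whether that extra penalty is desirable depends on whether one large miss should count as worse than several moderate ones.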

5. Can MSE be used for non-numerical data?

No, MSE is typically used for numerical data, as it calculates the squared difference between values. For non-numerical data, other metrics such as accuracy, precision, or recall may be more appropriate for evaluating model performance.
