Python 2.7: Fit a model to data

The discussion revolves around evaluating how well a known logarithmic function fits a set of data, specifically seeking a quantitative measure of closeness rather than just a line of best fit. The user has attempted methods like curve_fit and linregress but finds them insufficient for their needs. They are plotting data on log-log scales and have identified a gradient close to their model but want a more precise comparison. Suggestions include calculating the square of the residuals to assess fit quality and using correlation coefficients. The user seeks a Python function that can compute correlation while accounting for errors in one dataset, expressing uncertainty about the statistical methods involved. The conversation emphasizes the need for a tailored approach to quantify the fit between the model and the data.
EnSlavingBlair
Hi,

I'm trying to work out how well a known function fits a set of data. I'm not interested in the data's line of best fit or anything, I just want to know how close the data is to my model. I've tried using curve_fit and linregress, but neither really gives me what I'm after. My data follows a logarithmic curve, which I've been plotting on log-log scales to get a gradient of about -4. That is close to my model (-3.9), but I'd like to know exactly how close. So far linregress is the closest match for what I'm after, as it gives the correlation coefficient (how well the data follows the line of best fit), but it's still not exactly what I want.

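import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def line(x, a, b):
    # straight line in log-log space: a is the gradient, b the intercept
    return a*x + b

# start at index 1 to avoid ln(0) = -infinity
x = np.log(np.arange(1, len(coll_ave)))
y = np.log(coll_ave[1:])
popt, pcov = curve_fit(line, x, y, sigma=error[1:])
grad, inter, r_value, p_value, std_err = stats.linregress(x, y)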

These give me great info, just not quite what I'm looking for. As far as I'm aware, polyfit doesn't work for linear models, and I'd rather work with the loglog of my data than the raw data, as I know what gradient I'm after, but I have the equation for the model as well, so really it doesn't matter. If there's a numpy or scipy version, that would be great. Or a modification to curve_fit or linregress that would make it work.

Thanks for the help :D
 
You'll need some quantitative definition of "being close" first. What is the output you want to get?
 
I kind of lost you...but it sounds like you want to skip the fitting and evaluate your known curve against the data as if it had been fitted from them...can you just evaluate the square of the residuals, then, just as you would have had to do if you were fitting with a least-square method? But, as mfb said, you are going to need more of a reference to know how good a fit a given residual indicates.
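A minimal sketch of that idea, with no fitting step at all: evaluate the known model at the data's x values and sum the squared residuals, each scaled by its measurement error. The arrays below are made up for illustration, and the model is assumed to be a power law with exponent -3.9 (which is what a straight line of gradient -3.9 on log-log axes corresponds to); `chi_square` is a hypothetical helper, not a numpy or scipy function.

```python
import numpy as np

def chi_square(y_data, y_model, sigma):
    """Sum of squared residuals, each divided by its measurement error."""
    y_data = np.asarray(y_data, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return np.sum(((y_data - y_model) / sigma) ** 2)

# Made-up example: known model y = x**-3.9 (assumed form, no free parameters)
x = np.arange(1.0, 6.0)
y_model = x ** -3.9
sigma = 0.1 * y_model                                   # assumed 10% errors
y_data = y_model * np.array([1.05, 0.95, 1.10, 0.90, 1.00])

chi2_value = chi_square(y_data, y_model, sigma)
# Nothing was fitted, so the number of degrees of freedom is just N
reduced = chi2_value / len(y_data)
```

A reduced value near 1 would suggest the scatter about the model is about what the quoted errors predict; much larger values suggest a poor match or underestimated errors.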
 
I guess what I'm asking is if there is a python function that will determine the correlation coefficient of 2 sets of data. Your questions have certainly helped me figure things out. I want something like numpy.corrcoef(), but with the ability to include errors for one of the 1D arrays that are in it.

According to http://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html, it takes:
"A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables."
and a few other arguments.

However, is it possible to include errors for one of the 1D arrays? One array would represent my 'model', which does not need errors, and the other array would be my data, which comes with errors that I want to take into account. I'm not sure what the maths for something like that would be though, I have not done a great deal of statistics.
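As far as I know, numpy.corrcoef() has no argument for per-point errors. One possible approach, sketched here as an assumption rather than an established recipe for this problem, is a weighted Pearson coefficient in which each point is weighted by 1/sigma**2. The name `weighted_corrcoef` is hypothetical.

```python
import numpy as np

def weighted_corrcoef(model, data, sigma):
    """Pearson correlation with each point weighted by 1/sigma**2."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2

    # weighted means of both arrays
    m_mean = np.average(model, weights=w)
    d_mean = np.average(data, weights=w)

    # weighted covariance and variances
    cov = np.average((model - m_mean) * (data - d_mean), weights=w)
    var_m = np.average((model - m_mean) ** 2, weights=w)
    var_d = np.average((data - d_mean) ** 2, weights=w)
    return cov / np.sqrt(var_m * var_d)

# Made-up arrays for illustration
model = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
data = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
equal_sigma = np.ones(5)

r_weighted = weighted_corrcoef(model, data, equal_sigma)
r_plain = np.corrcoef(model, data)[0, 1]   # same value when all errors are equal
```

Note that a correlation coefficient only measures how linearly related the two arrays are, so on its own it would not distinguish a gradient of -4 from one of -3.9.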
 
I think the suggestion from gsal should work. Take the square of the residuals, then look up the probability with the chi2-distribution to get a single number.
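A sketch of that lookup, assuming the errors are Gaussian and independent: scipy.stats.chi2.sf gives the probability of getting a chi-square at least as large as the observed one if the model is correct. The data are made up, the model is assumed to be a power law with exponent -3.9, and `fit_probability` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

def fit_probability(y_data, y_model, sigma, n_fitted=0):
    """Return (chi-square, degrees of freedom, p-value) for a fixed model."""
    chi2_value = np.sum(((y_data - y_model) / sigma) ** 2)
    # the model is known in advance, so by default no parameters were fitted
    dof = len(y_data) - n_fitted
    p_value = stats.chi2.sf(chi2_value, dof)
    return chi2_value, dof, p_value

# Made-up example: model y = x**-3.9 with assumed 10% errors
x = np.arange(1.0, 6.0)
y_model = x ** -3.9
sigma = 0.1 * y_model
y_data = y_model * np.array([1.05, 0.95, 1.10, 0.90, 1.00])

chi2_value, dof, p_value = fit_probability(y_data, y_model, sigma)
```

A very small p-value would mean the data are unlikely under the model given the quoted errors; a value very close to 1 can hint that the errors are overestimated.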
 
