Python 2.7: Fit a model to data

In summary, the conversation involves a person trying to determine how well a known function fits a set of data. They have tried using curve_fit and linregress, but neither gives them exactly what they are looking for. They have also considered using polyfit, but prefer to work with the log-log of the data. They are looking for a Python function that can determine the correlation coefficient of two sets of data while taking into account errors on one of the arrays. They are considering using the square of the residuals and the chi2-distribution to get a single number for evaluating the fit.
  • #1
EnSlavingBlair
Hi,

I'm trying to find out how well a known function fits a set of data. I'm not interested in the data's line of best fit or anything, I just want to know how close the data is to my model. I've tried using curve_fit and linregress, but neither really gives me what I'm after. My data follows a logarithmic curve, which I've been plotting on loglog scales to get a gradient of about -4, which is close to my model (-3.9), but I'd like to know exactly how close. Linregress is so far the closest match for what I'm after, as it gives the correlation coefficient (how well the data follows the line of best fit), but it's still not exactly what I want.

import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def line(x, a, b):
    return a*x + b

x = np.log(range(len(coll_ave)))
x = x[1:] # I've done this to avoid the whole ln(0)=infinity thing
y = np.log(coll_ave[1:])
popt, pcov = curve_fit(line, x, y, sigma=error[1:])
grad, inter, r_value, p_value, std_err = stats.linregress(x, y)

These give me great info, just not quite what I'm looking for. As far as I'm aware, polyfit doesn't work for linear models, and I'd rather work with the loglog of my data than the raw data, as I know what gradient I'm after, but I have the equation for the model as well, so really it doesn't matter. If there's a numpy or scipy version, that would be great. Or a modification to curve_fit or linregress that would make it work.

Thanks for the help :D
 
  • #2
You'll need some quantitative definition of "being close" first. What is the output you want to get?
 
  • #3
I kind of lost you...but it sounds like you want to skip the fitting and evaluate your known curve against the data as if it had been fitted from them...can you just evaluate the square of the residuals, then, just as you would have had to do if you were fitting with a least-squares method? But, as mfb said, you are going to need more of a reference to know how good a fit a given residual indicates.
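
A minimal sketch of that suggestion, assuming synthetic stand-in data since the thread does not include the real arrays. The helper name `sum_squared_residuals`, the generated `y`, and the intercept of 2.0 are all illustrative assumptions. The known model is evaluated at each x and the squared residuals are summed, with no fitting step.

```python
import numpy as np

def sum_squared_residuals(y_data, y_model):
    """Sum of squared differences between data and a fixed (unfitted) model."""
    y_data = np.asarray(y_data, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return float(np.sum((y_data - y_model) ** 2))

# Hypothetical stand-in data: log-log slope of -4 plus Gaussian noise.
rng = np.random.RandomState(0)
x = np.log(np.arange(1, 51))
y = -4.0 * x + 2.0 + rng.normal(scale=0.1, size=x.size)

# Known model with slope -3.9; the intercept is assumed known here.
y_model = -3.9 * x + 2.0
ssr = sum_squared_residuals(y, y_model)
```

On its own this number has no scale, which is the point made above: it only becomes interpretable once it is compared with the size of the measurement errors.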
 
  • #4
I guess what I'm asking is if there is a python function that will determine the correlation coefficient of 2 sets of data. Your questions have certainly helped me figure things out. I want something like numpy.corrcoef(), but with the ability to include errors for one of the 1D arrays that are in it.

From http://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html it takes;
"A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables."
and a few other variables.

However, is it possible to include errors for one of the 1D arrays? One array would represent my 'model', which does not need errors, and the other array would be my data, which comes with errors that I want to take into account. I'm not sure what the maths for something like that would be though, I have not done a great deal of statistics.
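
One possible reading of this request is a weighted Pearson correlation, where each data point is weighted by the inverse square of its error. This is an assumption about what is wanted, not something numpy.corrcoef() provides, and the function name `weighted_corrcoef` below is hypothetical.

```python
import numpy as np

def weighted_corrcoef(model, data, sigma):
    """Pearson correlation of model vs. data with weights 1/sigma**2."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2

    # Weighted means of each array.
    mean_m = np.sum(w * model) / np.sum(w)
    mean_d = np.sum(w * data) / np.sum(w)

    # Weighted covariance and variances (the common 1/sum(w) factor cancels).
    cov = np.sum(w * (model - mean_m) * (data - mean_d))
    var_m = np.sum(w * (model - mean_m) ** 2)
    var_d = np.sum(w * (data - mean_d) ** 2)
    return float(cov / np.sqrt(var_m * var_d))
```

With equal errors on every point this reduces to the ordinary correlation coefficient. Note that a correlation coefficient only measures how linearly related the two arrays are, so a model with the wrong gradient or offset can still score close to 1.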
 
  • #5
I think the suggestion from gsal should work. Take the square of the residuals, then look up the probability with the chi2-distribution to get a single number.
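
A rough sketch of that recipe, assuming the errors are Gaussian and independent. The function name `chi2_against_model` and the `n_fitted_params` argument are illustrative. Each residual is divided by its error, the squares are summed to give chi-squared, and scipy.stats.chi2.sf converts that into the probability of a chi-squared at least this large arising by chance.

```python
import numpy as np
from scipy import stats

def chi2_against_model(y_data, y_model, sigma, n_fitted_params=0):
    """Return (chi2, dof, p_value) for data compared with a fixed model."""
    y_data = np.asarray(y_data, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Residuals in units of the measurement error.
    chi2 = float(np.sum(((y_data - y_model) / sigma) ** 2))

    # No parameters were fitted to the data, so dof is the number of points
    # unless the caller says otherwise.
    dof = y_data.size - n_fitted_params

    # Survival function: probability of chi2 this large or larger.
    p_value = float(stats.chi2.sf(chi2, dof))
    return chi2, dof, p_value
```

A chi2 close to dof suggests the model is consistent with the data within the errors. If the comparison is done on np.log of the data, the errors need to be in log space too; to first order that is error / value rather than the raw error.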
 

1. What is Python 2.7?

Python 2.7 is an older version of the Python programming language. It was released in 2010 and is no longer actively supported as of January 2020. However, it is still commonly used in legacy systems and for specific projects that require compatibility with older Python code.

2. What does it mean to fit a model to data?

Fitting a model to data refers to the process of finding the best mathematical representation of a dataset. This involves selecting a model that can accurately describe the relationship between the variables in the data and using statistical techniques to estimate the parameters of the model that best fit the data.

3. How do I fit a model to data using Python 2.7?

To fit a model to data using Python 2.7, you will need to use a library or package that provides functions for model fitting. Some popular options include NumPy, SciPy, and scikit-learn. NumPy offers polynomial fitting (numpy.polyfit), SciPy offers general least-squares curve fitting and linear regression (scipy.optimize.curve_fit, scipy.stats.linregress), and scikit-learn covers machine learning models such as linear regression, logistic regression, and neural networks.
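
As a small illustration, here is a sketch of a straight-line fit with scipy.optimize.curve_fit on made-up data. The data, the noise level, and the variable names are assumptions for the example only. It should run unchanged on Python 2.7 and Python 3 with a SciPy version that supports the `absolute_sigma` argument.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    """Straight line; in log-log space a is the power-law exponent."""
    return a * x + b

# Made-up data: slope -4, intercept 2, Gaussian noise of width 0.1.
rng = np.random.RandomState(1)
x = np.log(np.arange(1, 21))
sigma = np.ones(x.size) * 0.1
y = line(x, -4.0, 2.0) + rng.normal(scale=0.1, size=x.size)

# absolute_sigma=True treats sigma as real error bars, so pcov gives
# parameter uncertainties in the same units.
popt, pcov = curve_fit(line, x, y, sigma=sigma, absolute_sigma=True)
slope, intercept = popt
slope_err, intercept_err = np.sqrt(np.diag(pcov))
```

Comparing `slope` and `slope_err` with a model value such as -3.9 is one simple way to judge whether the fitted gradient is consistent with the model.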

4. What are some common challenges when fitting a model to data in Python 2.7?

Some common challenges when fitting a model to data in Python 2.7 include compatibility issues with newer versions of Python, limited support for newer machine learning techniques, and potential bugs or errors due to the older code base. It is important to thoroughly test and validate your code to ensure accurate results.

5. Can I still use Python 2.7 for model fitting?

Yes, you can still use Python 2.7 for model fitting, but it is not recommended as it is no longer actively supported. It is recommended to upgrade to a newer version of Python, such as Python 3, which has improved functionality and support for newer techniques. Additionally, many libraries and packages have stopped supporting Python 2.7, making it more difficult to find resources and troubleshoot issues.
