# Curve-fitting y=x^p+C

1. Nov 6, 2011

### ballzac

Curve-fitting y=Ax^p+C

Howdy,

So I have some data that I suspect follows a

$$y=x^p+C$$

relationship, where p and C are unknown real numbers. The y values contain some uncertainty, so I want to use a least squares (or similar) method to fit a curve and quantify the goodness of fit. I actually have a value for y(0) most of the time, so what I have been doing is using that as my value for C and plotting

$$\ln(y - y(0)) \text{ versus } \ln x$$

from which I get the gradient p using standard linear regression. This works fine for the most part, but it involves the erroneous assumption that there is no uncertainty in the value for C. It also exaggerates the deviations from this model near x=0, which is problematic when the choice of C is not that great. I thought I could maybe vary C and minimise the residuals of the linearised plot and obtain a best value for C that way, but I'm thinking there's probably a more elegant way.
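
The "vary C and minimise the linearised residuals" idea in the last paragraph can be sketched as follows. This is an illustrative Python snippet on synthetic data (the thread's actual data isn't available): scan candidate C values and keep the one whose log-log fit is most linear.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic data of the form y = A*x^p + C (values chosen for illustration)
x = np.linspace(0.1, 4.0, 60)
y = 1.8 * x**1.2 + 0.4 + rng.normal(0.0, 0.02, x.size)

best = None
# Scan candidate C values (must stay below min(y) so the log is defined),
# linearise, and keep the C whose log-log fit leaves the smallest residuals
for C in np.linspace(0.0, y.min() - 1e-6, 200):
    ly = np.log(y - C)
    lx = np.log(x)
    slope, intercept = np.polyfit(lx, ly, 1)
    sse = np.sum((ly - (slope * lx + intercept))**2)
    if best is None or sse < best[0]:
        best = (sse, C, slope, np.exp(intercept))

sse, C_hat, p_hat, A_hat = best
```

This works, but as the replies below point out, fitting the nonlinear model directly is usually cleaner.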

This seems like it must be a really common issue, but I can't find anything about it anywhere. Any ideas? Cheers :)

Last edited: Nov 6, 2011
2. Nov 6, 2011

### Number Nine

Just fit a nonlinear regression model $y = B_0 + x^{B_1}$. Most statistical packages (and certainly R) will have that capability. The least squares estimates will still work fine under the usual assumptions.
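
This suggestion, sketched in Python (the post mentions R; SciPy's `curve_fit` plays the same role here, and the data below are synthetic stand-ins for the thread's unavailable data):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Synthetic data following y = x^p + C with small Gaussian noise
# (illustrative parameter values, not the thread's real data)
p_true, C_true = 1.7, 0.5
x = np.linspace(0.1, 5.0, 80)
y = x**p_true + C_true + rng.normal(0.0, 0.02, x.size)

def model(x, p, C):
    return x**p + C

# Nonlinear least squares; p0 supplies the required starting values
popt, pcov = curve_fit(model, x, y, p0=[1.0, 0.0])
p_hat, C_hat = popt
perr = np.sqrt(np.diag(pcov))  # one-sigma standard errors of the estimates
```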

3. Nov 6, 2011

### DrStupid

If you need a linear regression you may try

$\int\limits_{x_0 }^x {y \cdot dx} = \frac{{x \cdot y}}{{p + 1}} + \frac{{p \cdot C \cdot x}}{{p + 1}} - \frac{{x_0 \cdot \left( {y_0 - C} \right)}}{{p + 1}} - C \cdot x_0$
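
The integral trick can be checked numerically: integrating y gives a relation that is linear in $x \cdot y$ and $x$, so ordinary linear regression recovers the coefficients, and from those p and C. A minimal Python sketch on noiseless synthetic data (parameter values chosen for illustration; the integral is approximated with the trapezoid rule):

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Noiseless synthetic data y = x^p + C with known p = 2, C = 3
x = np.linspace(0.5, 5.0, 500)
y = x**2 + 3.0

# S(x) approximates the integral of y from x0 to x
S = cumulative_trapezoid(y, x, initial=0.0)

# Fit S = a*(x*y) + b*x + d, where a = 1/(p+1) and b = p*C/(p+1)
M = np.column_stack([x * y, x, np.ones_like(x)])
(a, b, d), *_ = np.linalg.lstsq(M, S, rcond=None)

p_hat = 1.0 / a - 1.0     # invert a = 1/(p+1)
C_hat = b / (a * p_hat)   # invert b = p*C/(p+1)
```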

4. Nov 6, 2011

### AlephZero

It's your data, so you can try to fit any type of curve you like, but think about what your equation means physically.

You are saying that $x$ and $y$ are measured in units such that $x^p$ and $y$ have the same units. That is a very strange coincidence, especially if $p$ is an arbitrary real number.

It would make a lot more physical sense to fit a curve like $y = Ax^p + C$, or writing the same thing a different way, $y = e^{ax} + C,$ where the constants $A$ or $a$ are units-dependent.

5. Nov 6, 2011

### ballzac

Thanks for the advice, everyone. You are right, AlephZero; I forgot to include A, though units aren't an issue because x and y are both dimensionless. So I really have three unknowns that I need to work out.

6. Nov 6, 2011

### Number Nine

One more unknown isn't an issue, since your model is a pretty standard nonlinear regression model. Most statistics packages will let you fit it, but you'll need to provide reasonable starting values: nonlinear least squares estimation uses iterative methods, so you need to supply starting values for the parameters. They don't have to be very accurate, just reasonable. Remember to do the usual diagnostics afterwards (is the error variance constant? Are the residuals normal? etc.)
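
One way to get those starting values is from the rough linearised fit described earlier in the thread. A sketch in Python (SciPy standing in for a stats package; data and values are synthetic):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

# Synthetic data y = A*x^p + C (illustrative values)
A_true, p_true, C_true = 2.0, 1.5, 0.8
x = np.linspace(0.2, 4.0, 60)
y = A_true * x**p_true + C_true + rng.normal(0.0, 0.03, x.size)

# Rough starting values from a linearised log-log fit, using a crude
# offset estimate C0 just below min(y) so the log stays defined
C0 = y.min() - 1e-3
slope, intercept = np.polyfit(np.log(x), np.log(y - C0), 1)
start = [np.exp(intercept), slope, C0]   # [A0, p0, C0]

def model(x, A, p, C):
    return A * x**p + C

popt, _ = curve_fit(model, x, y, p0=start)
A_hat, p_hat, C_hat = popt
```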

7. Nov 6, 2011

### ballzac

Yep, I'm just checking out the nlinfit function in MATLAB. It seems to be the way to go. The 'guesses' shouldn't be a problem, because I can use my rough estimates from the attempted linear regression as a starting point.

Thanks heaps for your help. :)

8. Nov 7, 2011

### JJacquelin

Hello ballzac,

Fitting y=a+b(x^c) to experimental data (x,y) can be carried out with various non-linear regression methods, which require iterative processes.
Nevertheless, a non-iterative process was published in the paper:
"Régressions et équations intégrales" (in French, but the process itself is short and legible, page 17)
http://www.scribd.com/JJacquelin/documents

9. Nov 7, 2011

### Redbelly98

Staff Emeritus
I'd just like to correct this. $Ax^p$ is a power function, while $e^{ax}$ is exponential. They are not the same thing written in a different way -- I believe you'll agree if you think it over.

10. Nov 7, 2011

### ballzac

Thanks for posting.

I've managed to fit a really nice curve to my data. Plotting the residuals, I see that there is structure that's unaccounted for in my model, but the residuals are very small, so it does provide an accurate 'rule of thumb'.

The curve fits better than what I obtained using a log-log plot with linear regression, but when I subtract the intercept obtained from the nonlinear regression and then plot log-log, the data do not look linear, and the fitted line completely misses the first few points. This is because the log-log plot exaggerates errors in the small values and suppresses them in the large values.

I find this interesting because it calls into question some of the previous work I've done, where the data went through the origin and I quite happily used linear regression on a log-log plot. Still, this is the standard way these sorts of problems are expected to be tackled, and data are generally presented in linearised form where possible. I just don't think it's that good if it causes the RMS error of the fitted curve to be higher than it would be if the data weren't linearised in the first place and nonlinear regression were used instead.
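
The effect described here (linearisation inflating the RMS error in the original coordinates) can be reproduced on synthetic data. A hedged sketch in Python, with illustrative parameter values; the nonlinear fit minimises squared error directly, while the log-log fit implicitly reweights toward the small-x points:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
x = np.linspace(0.1, 5.0, 100)
y = 1.5 * x**1.3 + 0.2 + rng.normal(0.0, 0.05, x.size)

def model(x, A, p, C):
    return A * x**p + C

# Nonlinear fit in the original coordinates
(A1, p1, C1), _ = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])
rms_nonlinear = np.sqrt(np.mean((y - model(x, A1, p1, C1))**2))

# Linearised fit: subtract an assumed C, then regress ln(y-C) on ln(x).
# The log scale emphasises the small-x points, so the curve it implies
# can have a larger RMS error back in the original coordinates.
C_assumed = 0.2
pos = (y - C_assumed) > 0        # guard against log of non-positive values
slope, intercept = np.polyfit(np.log(x[pos]), np.log(y[pos] - C_assumed), 1)
A2, p2 = np.exp(intercept), slope
rms_linearized = np.sqrt(np.mean((y - (A2 * x**p2 + C_assumed))**2))
```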

11. Nov 7, 2011

### Number Nine

What do the residuals look like? Be very wary of non-normality or changing variance; either can completely destroy the credibility of your estimators. When in doubt, run a robust regression model and see whether the estimates are consistent.
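
A minimal robust-regression sketch in Python (SciPy's `least_squares` with a `soft_l1` loss is one of several robust choices; the data are synthetic with injected outliers):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
x = np.linspace(0.1, 5.0, 80)
y = 2.0 * x**1.4 + 0.5 + rng.normal(0.0, 0.05, x.size)
y[::10] += 1.0   # contaminate a few points with large outliers

def residuals(theta, x, y):
    A, p, C = theta
    return A * x**p + C - y

# Robust fit: the soft_l1 loss downweights large residuals,
# so the outliers pull the estimates far less than plain least squares
res = least_squares(residuals, x0=[1.0, 1.0, 0.0],
                    loss="soft_l1", f_scale=0.1, args=(x, y))
A_hat, p_hat, C_hat = res.x
```

If the robust estimates agree with the ordinary least squares ones, that's some reassurance the outliers aren't driving the fit.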

12. Nov 7, 2011

### ballzac

The two data sets are shown as red squares in the upper two plots, with the fitted curves shown in blue. The residuals are shown for each data point under each respective plot. Note the 10^-3 on the y-axis for the residuals.

For actually finding a model that explains the data, this might not be the best fit, but that might not matter. The purpose of these curves is to motivate the choice of certain experimental parameters in order to minimise error in a particular experiment. There is no real reason to try to extrapolate these data, and given a value between 0 and $8 \times 10^{-3}$, one can quite reliably choose a value for these two parameters.

#### Attached: untitled.jpg (24.6 KB)
13. Nov 7, 2011

### Number Nine

If you're not doing any inference and are just trying to "see what the curve looks like", then you should be fine. It looks to me like the error variance is increasing quadratically-ish; if you wanted to be rigorous, you might try fitting a weighted least squares model.
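
A weighted least squares sketch in Python (SciPy; the weights here assume the error standard deviation at each point is known, which is an idealisation, and the data are synthetic with variance growing in x):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)
x = np.linspace(0.1, 5.0, 100)

# Heteroscedastic noise: the standard deviation grows with x
sigma = 0.01 + 0.02 * x**2
y = 1.2 * x**1.6 + 0.3 + rng.normal(0.0, sigma)

def model(x, A, p, C):
    return A * x**p + C

# Weighted least squares: passing sigma downweights the noisier points
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0],
                       sigma=sigma, absolute_sigma=True)
A_hat, p_hat, C_hat = popt
```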

14. Nov 7, 2011

### ballzac

I might have to do some reading as I don't know much about this subject, but yes, for my current purposes I think what I've done will suffice.

It actually amazes me how little emphasis there was on data analysis in my undergrad physics studies.

15. Nov 7, 2011

### Number Nine

If you want to familiarize yourself with the basics, I recommend Montgomery's Introduction to Linear Regression Analysis.

16. Nov 7, 2011

### ballzac

Thanks for the recommendation, and thanks for all the help :)

17. Nov 8, 2011

### Mute

If at some point you are actually interested in getting precise estimates for your fit and you want to use a power-law model, you need to be careful how you do things.

For instance, your original attempt, a least squares regression on the logarithm of the data, is quite susceptible to systematic errors in the measurements. I am not sure exactly how the nonlinear curve fit works, but it may have similar problems.

This paper discusses proper fitting of power laws to experimental data using maximum likelihood estimators, which is a better method of fitting power laws to data. It is formulated for fitting power-law distributions (which tend to have negative exponents), so some modifications may be necessary for your specific case. It also discusses, to some extent, other distributions that may fit power-law-like data, but I don't think it gives a maximum likelihood estimator for those.

The thing about power-law fits is that there's a lot of experimental data out there that looks like it might follow a power law, so people love to go around haphazardly fitting power laws to things that may or may not actually be power laws. If you get to the stage where you want good estimates for your parameters, this paper is a must-read.

Edit: The paper also has an associated website with downloadable code (mostly MATLAB, but some has been written for R and other languages) that performs maximum-likelihood estimation on some example data sets.
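
For the distribution-fitting case the paper addresses, the maximum-likelihood exponent estimate has a simple closed form when the lower cutoff xmin is known. A small self-contained sketch (Python; the samples are synthetic, drawn by inverse-transform sampling):

```python
import numpy as np

rng = np.random.default_rng(5)

# Draw samples from a continuous power law p(x) ~ x^(-alpha) for x >= xmin,
# using inverse-transform sampling of its CDF
alpha_true, xmin, n = 2.5, 1.0, 50_000
u = rng.random(n)
samples = xmin * (1.0 - u) ** (-1.0 / (alpha_true - 1.0))

# Closed-form maximum-likelihood estimate of the exponent for known xmin
alpha_hat = 1.0 + n / np.sum(np.log(samples / xmin))
```

Note this estimates the exponent of a power-law *distribution*; fitting a power-law *curve* y = A·x^p + C to (x, y) pairs, as in this thread, is a different problem and still calls for (possibly weighted) nonlinear least squares.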

18. Nov 8, 2011

### ballzac

Thanks for the info and links. Some of my data I have very good reason to believe follows a power law, and I'm planning on deriving the relationship analytically at some stage, but some of it I am just guessing. At this stage, I wouldn't even know where to begin working out what the relationship should be analytically for these.

I will read the paper you suggest and try applying it to the case where I have good reason to believe it follows a power law.

Cheers

19. Nov 9, 2011

### JJacquelin

Hello ballzac,
I have tested the power law y=a+b(x^c) with your data and observed that the fit is rather good.
I used data read off your graph (post #12), which is not accurate enough. It would be better if you published the data in numerical form instead of as a graph.

20. Nov 9, 2011

### ballzac

I'm confused

The plots I posted in #12 included curves fitted using nonlinear regression, as well as the associated constants A, p, and C (b, c, and a, respectively, in your convention).