Interpreting results of a polynomial fit

In summary, when extrapolated to twice the fitted region, the third-order polynomial matched the data better than the first- and second-order polynomials. It is dangerous to extrapolate a polynomial fit, though - I would only feel comfortable doing it if you had some physical insight that led you to expect the relationship to be a polynomial of a certain order.

I am currently working on a gamma-ray spectroscopy lab in which I have just fit a polynomial to my calibration points. The calibration points lie on a nearly straight line, from (x=40, y=34) at the first point to (x=450, y=1300) at the last, where x is channel number and y is energy. The calibration will convert channel number to energy on the rest of my spectrum graphs.

I noticed, while increasing the polynomial's degree from first to second to third order, that the slope of the curve at twice the x value of the last calibration point decreased. For my data, the third-order polynomial gave the best result when extrapolated to twice the region I had fit.

Is this because a third-order polynomial is inherently a better estimate of extrapolated data, since it fits a given data set more accurately? I am afraid I don't understand how increasing the degree of a fit to data that is (supposedly) linear would increase its validity beyond the fitted range. Was this just luck?

(Equation of my line: 2.787e+000*Ch - 5.952e-005*Ch^2. I assume the first term is the first-order term and the second is the second-order term. I forgot to print out the graph with the third-order fit, but it has a second-order term of the same magnitude.)

A higher-order polynomial will always fit the calibration points themselves at least as well, since it has more free parameters. Since your data is almost linear, the coefficient for the second-order term turns out to be very small.
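To get a feel for how much that small second-order coefficient matters, here is a rough Python sketch that evaluates the two quoted terms at the last calibration channel (450) and at twice that (900). It assumes the exponent in the posted equation is e+000 and that the fit has no constant term, as written; the function name is only for illustration.

```python
# Coefficients as quoted in the post (assuming the first exponent is e+000
# and that the fit has no constant term).
A1 = 2.787e+000   # first-order coefficient
A2 = -5.952e-005  # second-order coefficient

def calibration_terms(channel):
    """Return (linear term, quadratic term) of the quoted calibration at a channel."""
    return A1 * channel, A2 * channel**2

for channel in (450, 900):
    linear, quadratic = calibration_terms(channel)
    share = abs(quadratic) / linear
    print(f"Ch={channel}: linear={linear:.1f}, quadratic={quadratic:.1f}, "
          f"quadratic/linear={share:.2%}")
```

The quadratic term's share grows in proportion to the channel number, so a term that is just under 1% of the linear term at the end of the calibration range is about twice that at double the channel.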

Yes, I understand that a higher-order polynomial will inherently fit a data set better, but would it extrapolate better as well, or is this just pure chance?


Actually, you have to be careful with polynomial regression as it tends to "overfit". As you add terms the model will be unduly influenced by outliers. Data that already "looks" linear should probably be modeled with least squares linear regression or maximum likelihood estimation.
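One way to see this for yourself is to fit made-up data that really is linear and compare what each order predicts far outside the fitted range. The sketch below uses NumPy's `polyfit` and `polyval`; the slope, noise level, and channel spacing are assumed values chosen only to resemble the numbers in this thread, not real calibration data.

```python
import numpy as np

def extrapolate(x, y, order, x_new):
    """Least-squares fit a polynomial of the given order and evaluate it at x_new."""
    coeffs = np.polyfit(x, y, order)
    return float(np.polyval(coeffs, x_new))

# Made-up calibration: a true straight line plus Gaussian noise (assumed values).
rng = np.random.default_rng(1)
channels = np.linspace(40, 450, 10)
true_slope = 2.8
energies = true_slope * channels + rng.normal(0.0, 3.0, channels.size)

x_far = 900.0  # twice the last calibration channel
for order in (1, 2, 3):
    predicted = extrapolate(channels, energies, order, x_far)
    print(f"order {order}: E({x_far:.0f}) = {predicted:.1f} "
          f"(true line gives {true_slope * x_far:.1f})")
```

Changing the seed changes which order lands closest to the true line, which is why one higher-order fit happening to extrapolate well is not good evidence that it will do so in general.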

It is dangerous to extrapolate a polynomial fit - I would only feel comfortable doing it if you had some physical insight that led you to expect the relationship to be a polynomial of a certain order.

I'm curious why you are extrapolating at all. Do you not collect calibration data over the entire energy range of interest and use all of that data in your fit? You should if you want accurate results. Assuming you do that, you can calculate the least-squares error for different model orders: constant, linear, quadratic, cubic, ... The error will never go up as you raise the order, but you will find that at some point increasing the order only provides a minimal decrease in error, indicating you are fitting noise. Plotting the least-squares error versus model order will usually make it obvious when you should stop.
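As a minimal sketch of that error-versus-order check, something like the following Python would do, assuming NumPy is available. The data here is synthetic (a straight line plus noise, with made-up slope and noise level), and `sse_by_order` is just a helper name for this example.

```python
import numpy as np

def sse_by_order(x, y, max_order):
    """Return the sum of squared residuals for polynomial fits of order 0..max_order."""
    errors = []
    for order in range(max_order + 1):
        coeffs = np.polyfit(x, y, order)
        residuals = y - np.polyval(coeffs, x)
        errors.append(float(np.sum(residuals**2)))
    return errors

# Synthetic stand-in for calibration points (assumed values, not real data).
rng = np.random.default_rng(0)
channels = np.linspace(40, 450, 12)
energies = 2.8 * channels + rng.normal(0.0, 2.0, channels.size)

errors = sse_by_order(channels, energies, 4)
for order, err in enumerate(errors):
    print(f"order {order}: sum of squared residuals = {err:.2f}")
```

For data like this the error should drop sharply from order 0 to order 1 and then barely move, which is the point where adding terms is only fitting noise.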

Good luck!

jason

@SW VandeCarr: Thank you, this was very helpful

@JasonRF: No, I had not calibrated over all the energies. We used a mixed Eu(154?) radioactive source for calibration, and the highest-energy radiation line was about 1.2 MeV. Additionally, there is a pileup line (a line caused when two radiation counts arrive at the detector at the same time, so it "sees" them as one and their energies add together) from another source at 2.5 MeV. Most of the data that was collected was within the calibration range, but this line was not.

The rest of the data fit the calibration very nicely, within the range of error, but since I was extrapolating so far to estimate the energy of this line, which is weak at best, I was off by 4% or so. For extrapolating so far out of my calibration range that isn't terrible, but I'd like to get below 1%.