# How do you interpolate a set of points into a curve?

Hi,

Very general question, I never really covered this.

Say I have ten data points in 2D space and I need to find a curve which passes THROUGH these points, how do I go about this? I need it to be a curve because I need a function to operate on analytically, not a set of points.

Ordinarily I'd try some least squares fitting, but that won't do here because I really need the curve to cross all the points. Also, I can't use a polynomial fit because the function needs to go to 0 at infinity.

I have it in my head from reading somewhere that this can be done by fitting it to a set of basis functions, eg. a set of Gaussian distributions, and adding them together will make it a lot easier to fit to the points.

Does this have a proper name or something so I can research it further?

Thanks,
Ordinarily I'd go to the library for this sort of thing but my knowledge is so vague I don't have a starting point on this one.

Thanks,

## Answers and Replies

Let the function be a0 + a1x + a2x^2 + .......+a10x^10. Then find the coefficients ai using your data.

I said it can't be polynomial because my function needs to go to zero at infinity. Polynomials will just get massive at infinity instead.

Why can't you add an eleventh point at (say) (1000000, 0)?

Office_Shredder
Staff Emeritus
Science Advisor
Gold Member
The polynomial will still go to infinity.

I think the easiest way would be to find a bump function around each point (take the example here and translate to be at the x coordinate you need, then scale vertically to get the y coordinate)

http://en.wikipedia.org/wiki/Bump_function

And then add them all together.

This will only work if all the points have different x-coordinates. If they don't, you can rotate the plane so that they do and then find your bump functions, and then rotate back. Actually, that sounds like a PITA to calculate by hand

Hi,

Thanks very much, I haven't heard of this type of function but technically the result should have compact support so they seem perfect.

From reading my first post I made it completely unclear that I am trying to work with points in 3D space (in fact I explicitly said 2D which I can't figure out- maybe I was thinking of the domain)- so they are like (xi, yi, zi=f(xi,yi)). But I think this should be a simple extension if I make the argument of this bump function r instead of x.

I don't understand about the rotating the plane part, is this a problem if I have for example two x-coordinates that are so close that nearby bump functions will overlap? (and so the peaks will be higher than my dataset) I ideally want the function to be an interpolation between the functions, for example if I have a point (0,0,1) and (1,1,2) then the point at (0.5,0.5) is somewhere in between 1 and 2, which I can't see happening if I squash the functions into a set of thin peaks.

Sorry if I was unclear

Office_Shredder
Staff Emeritus
Science Advisor
Gold Member
No matter how close two points are with x coordinates, you can scale the bump function horizontally so that the bump functions for each don't overlap. The problem is for example if you have the points (0,1) and (0,2) that you need to pass through. You would set up the bump function as given in wikipedia for the point (0,1), and then you would add the bump function for (0,2), but then you would end up passing through neither point (instead passing through (0,3)), and no amount of scaling will help you.

LCKurtz
Science Advisor
Homework Helper
Gold Member
Hi,

Thanks very much, I haven't heard of this type of function but technically the result should have compact support so they seem perfect.

From reading my first post I made it completely unclear that I am trying to work with points in 3D space (in fact I explicitly said 2D which I can't figure out- maybe I was thinking of the domain)- so they are like (xi, yi, zi=f(xi,yi)). But I think this should be a simple extension if I make the argument of this bump function r instead of x.

I don't understand about the rotating the plane part, is this a problem if I have for example two x-coordinates that are so close that nearby bump functions will overlap? (and so the peaks will be higher than my dataset) I ideally want the function to be an interpolation between the functions, for example if I have a point (0,0,1) and (1,1,2) then the point at (0.5,0.5) is somewhere in between 1 and 2, which I can't see happening if I squash the functions into a set of thin peaks.

Sorry if I was unclear

No matter how close two points are with x coordinates, you can scale the bump function horizontally so that the bump functions for each don't overlap. The problem is for example if you have the points (0,1) and (0,2) that you need to pass through. You would set up the bump function as given in wikipedia for the point (0,1), and then you would add the bump function for (0,2), but then you would end up passing through neither point (instead passing through (0,3)), and no amount of scaling will help you.

I have a couple of questions/comments. In 2D if you use Shredder's idea of making the bump functions really skinny, the resulting function will have compact support and go through your points, but it won't look nice and probably isn't what you want. The reason is it will be mostly around zero and have a bunch of narrow little spikes up to your given points to interpolate the function.

Secondly, if you are really talking about 3D, are you really seeking a curve or do you mean surface?

Sorry yes a surface is a better word.

And in your first paragraph you are exactly right, I need the resulting function to smoothly vary between two data points, not always go down to zero. I guess another constraint is that the data models a field which we assume is continuous, so a fitting scheme which suppresses sharp gradients would be preferred.

Office_Shredder
Staff Emeritus
Science Advisor
Gold Member
Bump functions are smooth, so adding them together will be smooth. Do you mean you want something like the support to be simply connected (no holes), or to otherwise look like a nice set?

Ok, my terminology is causing some problems again, by "smooth" I do not mean continuous (although that is also good), I mean the gradient is minimised.

I can't find the mathematical words to explain this much better: The set of functions which cross all my data points can be ranked according to some measure of their "sharpness", for example the maximum magnitude of the gradient vector of the function. I want to be able to find some of the functions which have a relatively low maximum gradient vector magnitude.

For example the bump functions will have large gradients around the points, and are penalised in this ranking system.

random thought: the heat equation over time tends to smooth harsh edges; if we express the data points in temperatures and fix these over all time, shouldn't the solution to this equation as t---> inf give us the smoothest possible function which crosses all the points?

LCKurtz
Science Advisor
Homework Helper
Gold Member
Mikey, I don't know what level of mathematics you are studying. The link below might be a good place for you to look if you have had enough mathematics. It talks about radial basis functions and how they are used in interpolation. I believe it also gives examples having compact support.

http://www.farfieldtechnology.com/products/toolbox/theory/

Disclaimer: I don't know much about this subject so you are pretty much on your own if you follow up on this link.

Hurkyl
Staff Emeritus
Science Advisor
Gold Member
Ordinarily I'd try some least squares fitting, but that won't do here because I really need the curve to cross all the points. Also, I can't use a polynomial fit because the function needs to go to 0 at infinity.
Does it matter? You said you were interested in interpolation, not extrapolation.

Fitting a rational function is not much different than fitting a polynomial. Those can vanish at infinity.

Also, you could even try fitting a polynomial in both the inputs and outputs match zero as closely as possible, rather than trying to fit a polynomial in the inputs to the output as closely as possible.

Furthermore, "least squares" doesn't imply polynomial. You can take pretty much any function at all with parameters, and use least squares to determine parameters for which the function most closely resembles the data.

Oh wait, I misunderstood -- you want to get things exactly right? What is your application? You might be barking up the wrong tree -- even if you fit the data exactly, there's no guarantee that the resulting function will be exactly what you want at other data points.

Last edited:
I want to fit the data exactly because the data is all I have... I don't have any information on what function to use. It's not my liberty to enforce a certain type of function and calculate a least-squares fit to find the parameters of that function, at the expense of actually missing the points. By doing this I think I have manipulated the data, and I can find no justification for this without a physical model (which I don't have, nor do I have time to develop).

I suppose the only knowledge I have is that the fitting function should not have unnecessarily steep regions, and that it should go to zero at r=infinity, which I can justify because it wouldn't work otherwise.

The reason I am doing this is because I want a function which I can go two ways with:
1. I can sample it at the points I know and get the data I already have
2. I can operate on it.

LCKurtz- thankyou, I was thinking I need a set of basis functions. I'll have a look at this website.

Hurkyl
Staff Emeritus
Science Advisor
Gold Member
It's not my liberty to enforce a certain type of function and calculate a least-squares fit to find the parameters of that function, at the expense of actually missing the points. By doing this I think I have manipulated the data, and I can find no justification for this without a physical model (which I don't have, nor do I have time to develop).
Not knowing the application, I fear that you're going too far in the opposite direction: overfitting the sample data so strongly that you cannot expect to get a decent amount of accuracy when interpolating (or worse, when extrapolating).

Also, there's a technique you can use when you can't get more samples to validate your model -- instead of trying to fit a function to all of the data, you instead fit, say, 75% of the data. Then you can use the remaining 25% to validate the model. (Or maybe even just use 65% and 25% until you're happy with the results. Then, use the remaining 10% to make sure you're still happy with the results)