Data analysis: 2D Surface fitting from raw data

SUMMARY

This discussion focuses on generating a 2D surface fitting function from raw data using Python, specifically for a separable polynomial model f(x,y) = Σ_n(a_n x^n) * Σ_n(b_n y^n). The user seeks to calculate derivatives with respect to x and y without relying on expensive software like Mathematica or MATLAB. The solution involves minimizing the square of the error between observed and predicted z values through multivariate polynomial regression. Key tools mentioned include Python, SciPy, and the mathematical principles of differentiation and linear equations.

PREREQUISITES
  • Understanding of multivariate polynomial regression
  • Familiarity with Python programming
  • Knowledge of calculus, specifically differentiation
  • Basic experience with SciPy for scientific computing
NEXT STEPS
  • Implement multivariate polynomial regression using SciPy in Python
  • Learn about the least squares method for error minimization
  • Explore the use of NumPy for handling arrays and mathematical operations
  • Investigate alternative free computer algebra systems like Maxima and Sage
USEFUL FOR

This discussion is beneficial for data scientists, researchers, and programmers interested in performing 2D surface fitting and polynomial regression using Python without relying on commercial software.

lepatterso
Hey everybody

I've got a question for the programming savvy. I'm generating data which can generally be described by

f(x,y) = Σ_n(a_n x^n) * Σ_n(b_n y^n)

Basically what I'm looking at is a data set which shows a surface which looks like about 1/4th of a wonky bowl.

My trouble is that I need to calculate derivatives of f with respect to x and y, and do some mathematical magic. To do that, I need to first generate f(x,y) from my raw data. So far, I've always used a canned package like Mathematica or MATLAB to generate that function, but this time I've got to build it from scratch. (Those licenses are expensive!)

Does anyone have any good examples of a fitting routine worked out for a 2D surface? I'm working in Python, and am relatively new to scientific computing without training wheels.

Thanks!
 
If you have some confidence that your model f(x,y) is good then it isn't too difficult to fit a multivariate polynomial to your data without needing Mathematica.

Your goal is to minimize the square of the error between your observed and predicted z values.

To minimize a square, you differentiate the square, set the result equal to zero, and solve for the variables.
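If you want to double-check that differentiation step without doing it by hand, a symbolic algebra package can confirm it. This is a minimal sketch assuming sympy is installed (it is not part of the original discussion):

```python
import sympy as sp

# Symbolic check of the differentiation step below.
a, b, c, x, y, z = sp.symbols('a b c x y z')
err = (z - (a*x**2 + b*y + c))**2

# d(err)/da should equal 2*(z - (a*x**2 + b*y + c))*(-x**2)
d_da = sp.diff(err, a)
print(sp.simplify(d_da - 2*(z - (a*x**2 + b*y + c))*(-x**2)))  # 0
```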

Suppose a simplified example: I generate some data points of the form {x, y, 2x^2 + 3y + 5} and then pretend I haven't seen that 2, 3, or 5: {{3, -3, 14}, {-4, -3, 28}, {-3, -5, 8}, {2, 2, 19}, {-5, -5, 40}}

You want to minimize the sum of (z_i - (a x_i^2 + b y_i + c))^2.

Your x_i, y_i and z_i are now constants; your variables are a, b, c.

Differentiating (z_i - (a x_i^2 + b y_i + c))^2 with respect to a gives 2(z_i - (a x_i^2 + b y_i + c))(-x_i^2), and similarly for b and c.

Thus you want to solve

{0 == Sum(2 (z_i - (a x_i^2 + b y_i + c)) (-x_i^2)),
0 == Sum(2 (z_i - (a x_i^2 + b y_i + c)) (-y_i)),
0 == Sum(2 (z_i - (a x_i^2 + b y_i + c)) (-1))}

where the sums run over all your {x_i, y_i, z_i}.

That partially simplifies to

{0 == -8(19-4 a-2 b-c)-32(28-16 a+3 b-c)-18(14-9 a+3 b-c)-50(40-25 a+5 b-c)-18(8-9 a+5 b-c),
0 == -4(19-4 a-2 b-c)+6(28-16 a+3 b-c)+6 (14-9 a+3 b-c)+10(40-25 a+5 b-c)+10(8-9 a+5 b-c),
0 == -2(19-4 a-2 b-c)-2(28-16 a+3 b-c)-2 (14-9 a+3 b-c)-2(40-25 a+5 b-c)-2(8-9 a+5 b-c)}

which simplifies to

{0 == -3444 + 2118 a - 474 b + 126 c,
0 == 656 - 474 a + 144 b - 28 c,
0 == -218 + 126 a - 28 b + 10 c}

for the given data set.

Notice that these are n linear equations in n unknowns. That makes solving easy.

Solving that gives a==2, b==3, c==5 which were exactly the coefficients I used to generate the data set in the beginning and then waited to see if they would correctly reappear.
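In Python, that final linear solve is one call to numpy. Moving the constant terms of the three simplified equations to the right-hand side (which flips their signs) gives a 3x3 system:

```python
import numpy as np

# The three simplified normal equations above, written as M @ [a, b, c] = v.
M = np.array([[2118.0, -474.0, 126.0],
              [-474.0,  144.0, -28.0],
              [ 126.0,  -28.0,  10.0]])
v = np.array([3444.0, -656.0, 218.0])

a, b, c = np.linalg.solve(M, v)
print(a, b, c)  # a ≈ 2, b ≈ 3, c ≈ 5
```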

Check every detail of this carefully to make certain I have not made any errors.

If you study this for a while, brush up on your calculus, and try it on a few examples where you know what the answer should be (so you can flush out any errors or misunderstandings), then you should be able to do multivariate polynomial regression on your data without needing Mathematica. In the process you may gain a deeper understanding of why regression works.
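Once you understand the derivation, you don't even have to form the normal equations by hand: numpy's least-squares solver builds and solves them for you. A sketch using the same five data points (the column layout of the design matrix is the only choice you make):

```python
import numpy as np

# Least-squares fit of z = a*x**2 + b*y + c, same five points as above.
pts = np.array([[3, -3, 14], [-4, -3, 28], [-3, -5, 8],
                [2, 2, 19], [-5, -5, 40]], dtype=float)
x, y, z = pts.T

# Design matrix: one column per basis function x^2, y, 1.
A = np.column_stack([x**2, y, np.ones_like(x)])
coeffs, residuals, rank, sv = np.linalg.lstsq(A, z, rcond=None)
print(coeffs)  # approximately [2. 3. 5.]
```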

I urge you not to assume that if a cubic polynomial sort of fits the data, then a tenth-degree polynomial must fit better. But this has gone on long enough.

You could later consider trying Reduce, Maxima, Axiom, Sage or any of the other available "free" computer algebra systems. But you pay for those in trying to figure out where to get them, how to install them, how to use them, where to find people who can help answer questions, etc.
 
Bill's lesson looks good.

But, if you want to start taking advantage of popular algorithms in python, you may want to familiarize yourself with scipy and other packages.
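For instance, scipy.optimize.curve_fit will run the whole fit from a model function you supply; it accepts a tuple of coordinate arrays as the independent variable. A minimal sketch on the same example data (assumes scipy is installed):

```python
import numpy as np
from scipy.optimize import curve_fit

# Model function: z = a*x**2 + b*y + c, fit by scipy.optimize.curve_fit.
def surface(xy, a, b, c):
    x, y = xy
    return a * x**2 + b * y + c

x = np.array([3.0, -4.0, -3.0, 2.0, -5.0])
y = np.array([-3.0, -3.0, -5.0, 2.0, -5.0])
z = np.array([14.0, 28.0, 8.0, 19.0, 40.0])

popt, pcov = curve_fit(surface, (x, y), z)
print(popt)  # approximately [2. 3. 5.]
```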
 
