Question about creating a regression model

  • Thread starter: celery1
  • Tags: Model, Regression
The discussion centers on the need for formulas and code to perform polynomial regression, specifically quadratic and higher-order polynomial regressions, in the context of a plasma physics research project. The user has successfully implemented linear regression but is seeking guidance on how to extend this to polynomial functions, ideally up to sixth order. They express interest in understanding the mathematical foundations, including the use of the Vandermonde matrix and least squares techniques for solving polynomial regression problems.

A key point raised is the importance of context when fitting data to polynomial models. While the user aims to find the best polynomial fit for a dataset representing changes in velocity over time, there is a cautionary note about the potential pitfalls of fitting data without a specific model in mind, as it may lead to misleading conclusions. The user is also looking for practical code examples to facilitate the implementation of these polynomial regression techniques.
celery1
Hey, so I've started doing a plasma physics research project, and one of the things I have to do is design a function that approximates a curve based on the data points it's fed. So far I've found the formula for a linear regression, but I'm having trouble finding the formulas for quadratic and higher-order polynomial regressions.
I know that a calculator can do it, but I can't find source code in any language to compare it to.

More or less what I'm looking for is either code in some language like this one

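import math

# The function header was cut off in the post; the name below is assumed.
def linear_regression(pairs):
    """Least-squares fit of a line to a list of (x, y) pairs.

    Returns the slope m, intercept b, and correlation coefficient r for
    y = mx + b
    """
    Sx = Sy = Sxx = Sxy = Syy = 0.0
    n = len(pairs)
    # Accumulate the sums that appear in the closed-form solution.
    for x, y in pairs:
        Sx = Sx + x
        Sy = Sy + y
        Sxx = Sxx + x * x
        Sxy = Sxy + x * y
        Syy = Syy + y * y
    m = ((n * Sxy) - (Sx * Sy)) / ((n * Sxx) - Sx ** 2)
    b = (Sy - (m * Sx)) / n
    r = ((n * Sxy) - (Sx * Sy)) / (math.sqrt((n * Sxx) - (Sx ** 2)) *
                                   math.sqrt((n * Syy) - (Sy ** 2)))
    print("y = %sx + %s" % (m, b))
    print("r = %s" % r)
    return m, b, r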

which I can just translate, except that it should give the coefficients for polynomial functions like

y = ax^2 + bx + c

(that is, a quadratic regression), or simply the formulas for calculating quadratic regression, cubic regression, and so on. I would like to go as high as sixth order, but if you have anything that would help, please post it.
 
I gather you are looking to solve for the a_n in

y = \sum_{n=0}^N a_n\,x^n

Note the equation is linear in terms of the coefficients a_n: it's still a linear regression. Applying least squares techniques leads to

\begin{bmatrix}
v_0 & v_1 & \cdots & v_N \\
v_1 & v_2 & \cdots & v_{N+1} \\
\vdots & \vdots & \ddots & \vdots \\
v_N & v_{N+1} & \cdots & v_{2N}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_N \end{bmatrix} =
\begin{bmatrix} u_0 \\ u_1 \\ \vdots \\ u_N \end{bmatrix}

The matrix on the left is V^T V, where V is the Vandermonde matrix of the observations; its entries are the power sums

v_k = \sum_{i=1}^M x_i^k

where M is the number of observations and x_i is the i-th observation. The vector on the right is formed by

u_k =\sum_{i=1}^M y_ix_i^k

There are special techniques for solving the above. Google Vandermonde matrix for more.
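
As a minimal sketch of the system above in Python, assuming NumPy is available: the function name `polyfit_normal_equations` and the sample data are made up for illustration. It builds the sums v_k and u_k directly and hands the resulting square system to a general-purpose solver rather than to one of the specialised techniques mentioned above.

```python
import numpy as np

def polyfit_normal_equations(x, y, order):
    """Return a_0..a_N of the least-squares polynomial (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # v_k = sum_i x_i^k for k = 0..2N
    v = np.array([np.sum(x ** k) for k in range(2 * order + 1)])
    # u_k = sum_i y_i * x_i^k for k = 0..N
    u = np.array([np.sum(y * x ** k) for k in range(order + 1)])
    # Entry (j, k) of the left-hand matrix is v_{j+k}
    A = np.array([[v[j + k] for k in range(order + 1)]
                  for j in range(order + 1)])
    return np.linalg.solve(A, u)

# Made-up data lying exactly on y = 1 + 2x + 3x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * t + 3.0 * t * t for t in xs]
coeffs = polyfit_normal_equations(xs, ys, 2)
print(coeffs)  # close to [1, 2, 3], lowest power first
```

This system tends to become ill-conditioned as the order grows. `np.polyfit` solves the same least-squares problem from the Vandermonde matrix itself and is usually the more stable choice; note that it returns the coefficients highest power first.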
 
Hey celery1 and welcome to the forums.

Just a question for you that I feel is important: Do you want to fit the data to a particular model for a reason, or are you just fitting the data to different polynomial models?

The reason I bring this up is because finding a polynomial that gives the best fit may not be a good idea if you want to explain the data.

If on the other hand you had a particular model in mind for a reason (like for example an inverse-square relationship in a gravitational or electromagnetic experiment) and you wanted to test the fit to that particular model, that is one thing because there is context in this scenario.

If you want to just find the best polynomial that fits, then I would have to ask what you are trying to find out, because fitting data to a polynomial without context is dangerous and might give the wrong conclusion.
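
One way to see the risk described above, as a rough sketch: the data here are synthetic (a straight line plus noise, with an arbitrary seed), and `r_squared` is a helper name invented for this example. Each higher-order polynomial contains the lower-order ones as special cases, so the least-squares R² cannot go down as the order increases. The highest order therefore always looks "best" by that measure, even when the underlying relationship is linear.

```python
import numpy as np

def r_squared(x, y, order):
    """Coefficient of determination of a polynomial fit (illustrative helper)."""
    coeffs = np.polyfit(x, y, order)          # highest power first
    residuals = y - np.polyval(coeffs, x)
    ss_res = np.sum(residuals ** 2)           # unexplained variation
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total variation
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)                # arbitrary seed, synthetic data
x = np.linspace(-1.0, 1.0, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)

# Fit orders 1 through 6 to data that is really just a noisy line.
scores = [r_squared(x, y, order) for order in range(1, 7)]
for order, score in zip(range(1, 7), scores):
    print(order, round(score, 4))
```

A score that penalises extra coefficients, such as adjusted R², or a check against points held out of the fit, is a common way to compare orders more fairly.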
 
Pretty much, it's the output from my project, which is essentially a change in velocity with respect to time. So it's just a bunch of (x, y) points, and I want to determine which function fits them best to describe the data that I'm seeing.
It's the stuff that I posted for the linear regression but expanded to other polynomials, and then I'm thinking of using the r value to pick the best polynomial of them all.

I would like to use the matrix that DH posted but I don't fully understand how I would get the least squares regression from it.
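
For what it's worth, the link between that matrix and least squares can be sketched as follows, using the same symbols as DH's post (S, the sum of squared residuals, is the only name introduced here). Least squares picks the coefficients that minimise S, and setting each partial derivative to zero gives one row of the matrix equation:

```latex
S(a_0,\dots,a_N) = \sum_{i=1}^M \Bigl( y_i - \sum_{n=0}^N a_n x_i^n \Bigr)^2

\frac{\partial S}{\partial a_k}
  = -2 \sum_{i=1}^M x_i^k \Bigl( y_i - \sum_{n=0}^N a_n x_i^n \Bigr) = 0
\quad\Longrightarrow\quad
\sum_{n=0}^N a_n \sum_{i=1}^M x_i^{n+k} = \sum_{i=1}^M y_i x_i^k ,
\qquad k = 0,\dots,N
```

In terms of the sums defined earlier, this reads \sum_n v_{n+k} a_n = u_k. For N = 1 it reduces to two equations in a_0 and a_1 whose solution is the b and m of the linear regression code in the first post, with v_0 = n, v_1 = Sx, v_2 = Sxx, u_0 = Sy and u_1 = Sxy.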
 