# Question about creating a regression model

1. Dec 6, 2011

### celery1

Hey, so I've started a plasma physics research project, and one of the things I have to do is design a function that approximates a curve from the data points it's fed. So far I've found the formula for linear regression, but I'm having trouble finding the formulas for quadratic and higher-order polynomial regression.
I know that a calculator can do it, but I can't find source code in any language to compare against.

More or less, what I'm looking for is either code in some language, like this one for linear regression:

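    import math

    def linear_regression(pairs):  # name assumed; the original def line was lost
        """
        Least-squares line through (x, y) pairs:
        y = mx + b
        """
        # Running sums of x, y, x^2, x*y and y^2
        Sx = Sy = Sxx = Sxy = Syy = 0.0
        n = len(pairs)
        for x, y in pairs:
            Sx = Sx + x
            Sy = Sy + y
            Sxx = Sxx + x*x
            Sxy = Sxy + x*y
            Syy = Syy + y*y
        # Slope, intercept and correlation coefficient
        m = ((n * Sxy) - (Sx * Sy)) / ((n * Sxx) - Sx ** 2)
        b = (Sy - (m * Sx)) / n
        r = ((n * Sxy) - (Sx * Sy)) / (math.sqrt((n * Sxx) - (Sx ** 2)) *
                                       math.sqrt((n * Syy) - (Sy ** 2)))
        print("y = %sx + %s" % (m, b))
        print("r = %s" % r)
        return m, b, r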

which I can just translate, except for polynomial functions like

y = ax^2 + bx + c

(that is, something that gives the quadratic regression formula), or simply the formulas for calculating quadratic regression, cubic regression, and so on. I would like to go as high as sixth order, but if you have anything that would help, please post it.

2. Dec 6, 2011

### D H

I gather you are looking to solve for the $a_n$ in

$$y = \sum_{n=0}^N a_n\,x^n$$

Note the equation is linear in terms of the coefficients $a_n$: it's still a linear regression. Applying least-squares techniques leads to

$$\begin{bmatrix} v_0 & v_1 & \cdots & v_N \\ v_1 & v_2 & \cdots & v_{N+1} \\ \vdots & \vdots & \ddots & \vdots \\ v_N & v_{N+1} & \cdots & v_{2N}\end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_N\end{bmatrix} = \begin{bmatrix} u_0 \\ u_1 \\ \vdots \\ u_N\end{bmatrix}$$

The matrix on the left is built from power sums of the observations (it equals $V^TV$, where $V$ is the Vandermonde matrix of the $x_i$):

$$v_k = \sum_{i=1}^M x_i^k$$

where $M$ is the number of observations and $x_i$ is the $i$th observation. The vector on the right is formed by

$$u_k =\sum_{i=1}^M y_ix_i^k$$
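As a concrete instance (a worked case added for illustration, using the same $v_k$ and $u_k$), a quadratic fit $y = a_0 + a_1 x + a_2 x^2$ gives the 3×3 system

$$\begin{bmatrix} v_0 & v_1 & v_2 \\ v_1 & v_2 & v_3 \\ v_2 & v_3 & v_4\end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2\end{bmatrix} = \begin{bmatrix} u_0 \\ u_1 \\ u_2\end{bmatrix}$$

In terms of the running sums in the linear-regression code from the first post, $v_0 = n$, $v_1 = S_x$, $v_2 = S_{xx}$, $u_0 = S_y$ and $u_1 = S_{xy}$; the quadratic case only adds $v_3 = \sum x_i^3$, $v_4 = \sum x_i^4$ and $u_2 = \sum y_i x_i^2$.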

There are special techniques for solving the above. Google Vandermonde matrix for more.
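The system above can be sketched in plain Python as follows. This is a minimal illustration, not a tuned implementation: the function name `polyfit_normal_equations`, the `(x, y)` pair format (borrowed from the linear-regression code in the first post), and the crude `1e-12` pivot tolerance are all assumptions, and ordinary Gaussian elimination stands in for the special techniques mentioned above.

```python
def polyfit_normal_equations(pairs, degree):
    """Least-squares polynomial fit via the normal equations.

    pairs  -- list of (x, y) observations
    degree -- polynomial order N
    Returns [a_0, a_1, ..., a_N], lowest power first.
    (Hypothetical helper written for illustration.)
    """
    size = degree + 1
    # Power sums v_k = sum(x_i^k) for k = 0 .. 2N
    v = [sum(x ** k for x, _ in pairs) for k in range(2 * degree + 1)]
    # Right-hand side u_k = sum(y_i * x_i^k) for k = 0 .. N
    u = [sum(y * x ** k for x, y in pairs) for k in range(size)]
    # Augmented matrix [A | u], where A[r][c] = v[r + c]
    aug = [[v[r + c] for c in range(size)] + [u[r]] for r in range(size)]

    # Forward elimination with partial pivoting
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # assumed tolerance
            raise ValueError("singular system: need more distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution
    coeffs = [0.0] * size
    for r in range(size - 1, -1, -1):
        known = sum(aug[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (aug[r][size] - known) / aug[r][r]
    return coeffs


# Usage: points lying exactly on y = 1 + 2x + 3x^2
points = [(x, 1 + 2 * x + 3 * x * x) for x in range(5)]
print(polyfit_normal_equations(points, 2))
```

On these exact points the fit recovers coefficients of approximately 1, 2 and 3. For higher orders such as six, this matrix tends to become badly conditioned, so it may be safer to rescale $x$ first or to use a library routine such as `numpy.polyfit(x, y, deg)`, which returns the coefficients highest power first.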

3. Dec 6, 2011

### chiro

Hey celery1 and welcome to the forums.

Just a question for you that I feel is important: do you want to fit the data to a particular model for a reason, or are you just trying out different polynomial models?

The reason I bring this up is because finding a polynomial that gives the best fit may not be a good idea if you want to explain the data.

If on the other hand you had a particular model in mind for a reason (like for example an inverse-square relationship in a gravitational or electromagnetic experiment) and you wanted to test the fit to that particular model, that is one thing because there is context in this scenario.

If you want to just find the best polynomial that fits, then I would have to ask what you are trying to find out, because fitting data to a polynomial without context is dangerous and might give the wrong conclusion.

4. Dec 6, 2011

### celery1

Pretty much, it's the output from my project, which is essentially a change in velocity with respect to time. So it's just a bunch of points x,y, and I want to determine which function fits them best to describe the data that I'm seeing.
It's the stuff that I posted for the linear regression curve, but expanded for other polynomials, and then I'm thinking of using the r value to pick the best polynomial of them all.
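For polynomials beyond a straight line, the usual analogue of the r value is the coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$, which equals $r^2$ for a straight-line fit. A minimal sketch follows, assuming the coefficients are already known and stored lowest power first; the name `r_squared` and the pair format are illustrative choices, not part of the original code.

```python
def r_squared(pairs, coeffs):
    """Coefficient of determination for y = sum(coeffs[k] * x**k).

    pairs  -- list of (x, y) observations
    coeffs -- [a_0, a_1, ...], lowest power first (assumed convention)
    """
    n = len(pairs)
    mean_y = sum(y for _, y in pairs) / n
    # Residual sum of squares: data minus the polynomial's prediction
    ss_res = sum(
        (y - sum(a * x ** k for k, a in enumerate(coeffs))) ** 2
        for x, y in pairs
    )
    # Total sum of squares: data minus its own mean
    ss_tot = sum((y - mean_y) ** 2 for _, y in pairs)
    return 1.0 - ss_res / ss_tot


# Usage: points exactly on y = 1 + 2x
points = [(0, 1), (1, 3), (2, 5)]
print(r_squared(points, [1.0, 2.0]))
```

One caveat on using this to choose the order: for least-squares fits, $R^2$ never decreases when the degree is raised, so on its own it will always favour the highest-order polynomial tried.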

I would like to use the matrix that D H posted, but I don't fully understand how I would get the least-squares regression from it.
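For reference, here is a short sketch of where that matrix equation comes from, using the symbols from D H's post. Least squares chooses the coefficients that minimise the sum of squared residuals

$$S(a_0,\dots,a_N) = \sum_{i=1}^M \Bigl(y_i - \sum_{n=0}^N a_n\,x_i^n\Bigr)^2$$

Setting each partial derivative to zero gives one equation per coefficient:

$$\frac{\partial S}{\partial a_k} = -2\sum_{i=1}^M x_i^k\Bigl(y_i - \sum_{n=0}^N a_n\,x_i^n\Bigr) = 0, \qquad k = 0,\dots,N$$

Rearranging and swapping the order of summation,

$$\sum_{n=0}^N a_n \sum_{i=1}^M x_i^{k+n} = \sum_{i=1}^M y_i\,x_i^k \quad\Longrightarrow\quad \sum_{n=0}^N a_n\,v_{k+n} = u_k$$

which is row $k$ of the matrix equation. Solving that linear system for the $a_n$ therefore is the least-squares fit; no further step is needed.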