How can I use least squares to find coefficients for a cubic polynomial?

  • Thread starter ianripping
  • Tags
    Cubic
In summary, the conversation discusses using least squares to find the coefficients of a quadratic or higher-order polynomial from a data set, by minimizing the sum of squares of the differences between the fitted function and the true values. The process involves setting the first partial derivatives of that sum to zero and solving the resulting system of linear equations. A simplified example is provided for a first-order polynomial. Programs like Matlab or Octave can make the process easier for larger data sets.
  • #1
ianripping
Hi.

I know how to find the coefficients of a quadratic with a data set using least squares.
I would now like to do the same to find polynomials of 3rd and 4th degree.

Could someone help me with this?
The data set I am using:
(-4.5,0.7)
(-3.2,2.3)
(-1.4,3.8)
(0.8,5)
(2.5,5.5)
(4.1,5.6)

Thanks
 
  • #2
For a general polynomial interpolating function f of degree D, you have

[tex]f(x) = \sum_{k=0}^D a_k x^k[/tex]

Now, for your N data points [itex](x_i, y_i)[/itex], you want to minimize the squares of the difference between your interpolating function and the true [itex]y_i[/itex] values; i.e., you want to minimize the following:

[tex]S = \sum_{i=1}^N(f(x_i) - y_i)^2[/tex]

Now, f(x) can be considered to be a function of the parameters [itex]a_k[/itex], and thus S can also be considered to be a function of [itex]a_k[/itex]. We want to find the [itex]a_k[/itex] that minimize S. We note that S is positive and quadratic in each of the [itex]a_k[/itex], so we know that we can minimize S by setting all of its first partial derivatives equal to zero:

[tex]{\partial S \over \partial a_n} = 0 \text{ for all n}[/tex]

Expanding the partial derivative:

[tex]{\partial S \over \partial a_n} = {\partial \over \partial a_n} \left[ \sum_{i=1}^N(f(x_i) - y_i)^2 \right] = \sum_{i=1}^N {\partial \over \partial a_n} \left[(f(x_i) - y_i)^2\right] = \sum_{i=1}^N\left[2(f(x_i) - y_i)\left.{\partial f \over \partial a_n}\right|_{x_i}\right] = 0[/tex]

We note that for a particular n,

[tex]{\partial f \over \partial a_n} = x^n[/tex]

And so,

[tex]{\partial S \over \partial a_n} = \sum_{i=1}^N\left[2(f(x_i) - y_i)\left.{\partial f \over \partial a_n}\right|_{x_i}\right] = \sum_{i=1}^N\left[ 2\left(\sum_{k=0}^D a_k x_i^k - y_i\right)x_i^n \right] = 0[/tex]

If you expand out this sum, you will obtain a system of D+1 linear equations in the D+1 unknowns [itex]a_k[/itex] (i.e., if D=3 for a 3rd-degree polynomial, then you'll get 4 equations for your 4 unknown polynomial coefficients). Then you just have to solve this system of linear equations.
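
If you later want to check your algebra numerically, here's a minimal sketch in Python with NumPy (assuming you have it available; the function name lstsq_poly is just my own label) that builds and solves exactly this system for an arbitrary degree D:

[code]
import numpy as np

def lstsq_poly(xs, ys, D):
    """Fit a degree-D polynomial by least squares, solving the
    normal equations derived above: for each n = 0..D,
    sum_k a_k * sum_i x_i^(n+k) = sum_i x_i^n * y_i."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    M = np.array([[np.sum(xs**(n + k)) for k in range(D + 1)]
                  for n in range(D + 1)])
    b = np.array([np.sum(xs**n * ys) for n in range(D + 1)])
    return np.linalg.solve(M, b)  # returns [a_0, a_1, ..., a_D]
[/code]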
 
  • #3
Wow, it has been a while since I did my A-Level Statistics.
Could you help me up to the point of solving the 4 linear equations?

Could you show me an example / point to a link that shows how to get to the point of having the 4 linear equations?

Thanks
 
  • #4
For a third-degree polynomial, these are your equations:

[tex]\sum_{i=1}^N 2 \left( \sum_{k=0}^3 a_kx_i^k - y_i \right) = 0[/tex]

[tex]\sum_{i=1}^N \left[ 2 \left( \sum_{k=0}^3 a_kx_i^k - y_i \right) x_i \right] = 0[/tex]

[tex]\sum_{i=1}^N \left[ 2 \left( \sum_{k=0}^3 a_kx_i^k - y_i \right) x_i^2 \right] = 0[/tex]

[tex]\sum_{i=1}^N \left[ 2 \left( \sum_{k=0}^3 a_kx_i^k - y_i \right) x_i^3 \right] = 0[/tex]

Now, merely expand out each sum for your N data points, plug in each of your [itex]x_i, y_i[/itex], and solve for [itex]a_0, a_1, a_2, a_3[/itex]. Then your polynomial will be given by

[tex]f(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0[/tex]

Note that you can get rid of the extra factors of 2 by dividing them out of each equation. :P
 
  • #5
Hi, I am really sorry. I am not following.
I don't quite understand the equations.

Could you put my given data set into one of these 4 equations and show me the working?

Thanks
 
  • #6
Can anyone help me with this?
 
  • #7
All you do is plug in numbers. I haven't done it for you, because it's tedious. What part of it don't you understand?
 
  • #8
I just don't understand how I plug the numbers in.
I don't remember how the sigma sign works.

Could you show me how you get to the first linear equation?
 
  • #9
[tex]\sum_{k=M}^N g(k)[/tex]

means that you take the sum of g(k) for all k between M and N:

[tex]g(M) + g(M+1) + g(M+2) + \ldots + g(N-1) + g(N)[/tex]
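
In code terms the sigma is nothing but a loop; a throwaway Python illustration (g, M, N are just placeholders):

[code]
def sigma(g, M, N):
    """g(M) + g(M+1) + ... + g(N)"""
    return sum(g(k) for k in range(M, N + 1))

print(sigma(lambda k: k**2, 1, 4))  # 1 + 4 + 9 + 16 = 30
[/code]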
 
  • #10
Maybe the double summation is confusing you, Ian. The equations might look a little simpler if you note that the inner sum is nothing more than the polynomial evaluated at [itex]x_i[/itex].

That is,

[tex]P(x_i) = \sum_{k=0}^3 a_kx_i^k[/tex]

If you make this substitution, you can think of it as simply forming the set of four linear simultaneous equations given by:

[tex]\sum_{i=1}^N x_i^j \left(P(x_i) - y_i \right) = 0[/tex]

for each j in 0..3
 
  • #11
Sorry, I am really confused by this - without any examples with data sets I just can't get my head round it.
 
  • #12
You can avoid having to solve linear equations by using orthogonal polynomials.

Link to a rich text document with sample code:

opls.rtf
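
I can't reproduce the document here, but as a rough illustration of the idea, here is my own Python sketch of the classic three-term-recurrence approach (not the code from opls.rtf): the fit is accumulated one orthogonal basis polynomial at a time, so no simultaneous equations ever need to be solved.

[code]
import numpy as np

def orth_poly_fit(xs, ys, D, t):
    """Least-squares polynomial fit of degree D using polynomials
    orthogonal over the data abscissae; returns the fitted values
    at the points t. No linear system is solved."""
    xs, ys, t = (np.asarray(a, dtype=float) for a in (xs, ys, t))
    p_prev, p_cur = np.zeros_like(xs), np.ones_like(xs)  # values at xs
    q_prev, q_cur = np.zeros_like(t), np.ones_like(t)    # values at t
    fit, beta = np.zeros_like(t), 0.0
    for _ in range(D + 1):
        norm = np.dot(p_cur, p_cur)
        fit = fit + (np.dot(ys, p_cur) / norm) * q_cur   # add this basis term
        alpha = np.dot(xs * p_cur, p_cur) / norm
        # three-term recurrence for the next orthogonal polynomial
        p_prev, p_cur = p_cur, (xs - alpha) * p_cur - beta * p_prev
        q_prev, q_cur = q_cur, (t - alpha) * q_cur - beta * q_prev
        beta = np.dot(p_cur, p_cur) / norm
    return fit

# e.g. orth_poly_fit([1, 2, 3], [2, 3.1, 3.9], 1, np.array([0.0]))
# gives [1.1], the intercept of the straight line fitted later in the thread
[/code]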
 
  • #13
ianripping said:
Sorry, I am really confused by this - without any examples with data sets I just can't get my head round it.

Well, this is pretty tedious, so I don't think anyone is too keen to write out a big long numerical example, but here's a simplified example for D=1 and N=3 (first-order polynomial and only three data points). Hopefully it will help you understand how to apply the above equations in general.

Take the three points as:
(1.0, 2.0)
(2.0, 3.1)
(3.0, 3.9)


We are using a first-order fit, so P(x) = ax + b, and we need to determine the parameters "a" and "b".

So we'll have two linear simultaneous equations, which we get from writing

[tex]\sum_{i=1}^N x_i^j \left(P(x_i) - y_i \right) = 0[/tex]

for each j in 0..D (remember that D=1 in this example because it's first order), so that is one equation for each of j=0 and j=1.

Eqn 1. (j=0):

(a*1 + b - 2) + (a*2 + b - 3.1) + (a*3 + b - 3.9) = 0

Which simplifies to: 6a + 3b = 9

Eqn 2. (j=1):

1(a*1 + b - 2) + 2(a*2 + b - 3.1) + 3(a*3 + b - 3.9) = 0

Which simplifies to: 14a + 6b = 19.9

So you find the straight line of best fit (first-order polynomial) by solving those two simultaneous equations:

6a + 3b = 9
14a + 6b = 19.9


These give a=0.95 and b=1.1, so the equation is y = 0.95x + 1.1

BTW, do you have access to any programs like Matlab or Octave (free) to help manipulate large amounts of data? You can make this process very easy, even for higher-order fits, with such programs.
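
For what it's worth, here's that last step checked in Python (Octave would look nearly identical):

[code]
import numpy as np

# The two simultaneous equations from j=0 and j=1:
#   6a + 3b = 9
#  14a + 6b = 19.9
A = np.array([[6.0, 3.0],
              [14.0, 6.0]])
rhs = np.array([9.0, 19.9])
a, b = np.linalg.solve(A, rhs)
print(a, b)  # 0.95 1.1  ->  y = 0.95x + 1.1
[/code]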
 
  • #14
So if I were doing this for 2nd degree, where would my c coefficient come from?
 
  • #15
ianripping said:
So if I were doing this for 2nd degree, where would my c coefficient come from?

It would come quite naturally when you write P(x) for a given x during the above procedure.

When you write P(x) with known values of the coefficients but with x as the variable, it's a polynomial (quadratic, cubic, etc.) in x. However, when you follow the above procedure you are writing P(x) with unknown coefficients but a given value of x; in this case it is nothing other than a linear function of the coefficients! The solution above shows you how to set up the correct number of these linear equations in order to solve for the unknown coefficients. Just bite the bullet and have a try; I'm sure it will start to make sense once you start doing it.
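
For instance, with P(x) = a + bx + cx^2 and the data value x = 2, you'd write P(2) = a + 2b + 4c, which is a plain linear expression in the unknowns a, b, c.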
 
  • #16
The above example makes perfect sense for determining a straight regression line.
I can't see how to expand this to get a quadratic or even 3rd or 4th degree polynomial regression line.
 
  • #17
An example I found:
Example: Find the least squares quadratic of best fit for the following data.
x: 1 2 3 4
y: 1 2 2 2

n = 4
Sum of x = 10
Sum of x^2 = 30
Sum of x^3 = 100
Sum of x^4 = 354
Sum of y = 7
Sum of xy = 19
Sum of x^2 y = 59

So:

4a + 10b + 30c = 7
10a + 30b + 100c = 19
30a + 100b + 354c = 59

Solving (with y = a + bx + cx^2) gives:
y = -0.25 + 1.55x - 0.25x^2

Can anyone help me deduce how I would be able to get values for d and e, so that I can have equations in the form:
y = a + bx + cx^2 + dx^3
y = a + bx + cx^2 + dx^3 + ex^4
 
  • #18
ianripping said:
Find the least squares quadratic of best fit for the following data.
x 1 2 3 4
y 1 2 2 2
Can anyone help me deduce how I would be able to get values for d and e, so that I can have equations in the form:
y = a + bx + cx^2 + dx^3

You have 4 data points, which determine a cubic exactly: just plug in the 4 pairs of x and y values and solve the 4 linear equations. You need more than 4 points for a quartic. I've already posted a link to a document that includes an algorithm and code for implementing polynomial curve fitting based on orthogonal polynomials (they are used as an intermediate step; an ordinary polynomial is generated at the end).
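
To make that concrete, here's a quick check in Python/NumPy (coefficients come back lowest power first); note the cubic through these 4 points just reproduces them exactly, which is why you'd want more data for a meaningful cubic or quartic fit:

[code]
import numpy as np
from numpy.polynomial import polynomial as P

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 2.0, 2.0])

print(P.polyfit(x, y, 2))  # approx [-0.25, 1.55, -0.25], as above
print(P.polyfit(x, y, 3))  # cubic [a, b, c, d]: interpolates all 4 points
[/code]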
 
  • #19
I think I see the problem here... Ian has been taught a cookie-cutter method in class without any of the theory behind it, and rather predictably, he is not able to see how to extend the method to other situations.

Ian, let's start with Uart's equation, because I think it is easier to read than mine:

[tex]\sum_{i=1}^N x_i^j \left(P(x_i) - y_i \right) = 0[/tex]

(Remember, for a cubic fit this represents four different equations, one for each value of j from 0 to 3.)

Now, watch as I multiply some things and rearrange terms:

[tex]\sum_{i=1}^N \left(x_i^j P(x_i) - x_i^j y_i \right) = 0[/tex]

[tex]\sum_{i=1}^N x_i^j P(x_i) - \sum_{i=1}^N x_i^j y_i = 0[/tex]

[tex]\sum_{i=1}^N x_i^j P(x_i) = \sum_{i=1}^N x_i^j y_i[/tex]

Now, recall that

[tex]P(x_i) = \sum_{k=0}^3 a_kx_i^k[/tex]

So, we can just make this substitution for P(x_i):

[tex]\sum_{i=1}^N x_i^j \left( \sum_{k=0}^3 a_kx_i^k \right) = \sum_{i=1}^N x_i^j y_i[/tex]

Now, doing some more algebra:

[tex]\sum_{i=1}^N \sum_{k=0}^3 \left(x_i^j a_k x_i^k \right) = \sum_{i=1}^N x_i^j y_i[/tex]

[tex]\sum_{i=1}^N \sum_{k=0}^3 \left(a_k x_i^{j+k} \right) = \sum_{i=1}^N x_i^j y_i[/tex]

Finally, we can interchange the order of the summations on the left side, to get:

[tex]\sum_{k=0}^3 \sum_{i=1}^N \left(a_k x_i^{j+k} \right) = \sum_{i=1}^N x_i^j y_i[/tex]

[tex]\sum_{k=0}^3 \left(a_k \sum_{i=1}^N x_i^{j+k} \right) = \sum_{i=1}^N x_i^j y_i[/tex]

This should be in a form similar to the equations you've been using for the linear and quadratic cases.
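
If you have Python/NumPy handy, you can check that this system gives the same answer as NumPy's built-in least-squares polynomial fit (polyfit below returns coefficients lowest power first); a sketch using the data from the first post:

[code]
import numpy as np
from numpy.polynomial import polynomial as P

xs = np.array([-4.5, -3.2, -1.4, 0.8, 2.5, 4.1])
ys = np.array([0.7, 2.3, 3.8, 5.0, 5.5, 5.6])

# Left-hand matrix and right-hand side of the equations above (D = 3):
# sum_k a_k * (sum_i x_i^(j+k)) = sum_i x_i^j * y_i,  j = 0..3
M = np.array([[np.sum(xs**(j + k)) for k in range(4)] for j in range(4)])
rhs = np.array([np.sum(xs**j * ys) for j in range(4)])
print(np.linalg.solve(M, rhs))  # [a_0, a_1, a_2, a_3]
print(P.polyfit(xs, ys, 3))     # same coefficients from NumPy's own fit
[/code]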
 
  • #20
If I understand correctly, you want to fit a cubic polynomial to six data points [itex](x_i, y_i)[/itex]. The equation to be solved (for f) is [itex]A^TAf = A^Tg[/itex], where:

[tex]y = a + bx + cx^2 + dx^3[/tex] is the fitting function.

[tex]A = \left[\begin{matrix}1 & x_1 & x_1^2 & x_1^3 \\ 1 & x_2 & x_2^2 & x_2^3 \\ 1 & x_3 & x_3^2 & x_3^3 \\ 1 & x_4 & x_4^2 & x_4^3 \\ 1 & x_5 & x_5^2 & x_5^3 \\ 1 & x_6 & x_6^2 & x_6^3 \end{matrix}\right] \ \ \ f = \left[\begin{matrix}a \\ b \\ c \\ d \end{matrix}\right] \ \ \ g = \left[\begin{matrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{matrix}\right][/tex]
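
A sketch of this in Python/NumPy, for the data posted at the start of the thread (in practice np.linalg.lstsq(A, g) is the numerically safer way to solve the same problem, since it avoids forming A^T A explicitly):

[code]
import numpy as np

xs = np.array([-4.5, -3.2, -1.4, 0.8, 2.5, 4.1])
g = np.array([0.7, 2.3, 3.8, 5.0, 5.5, 5.6])

# Design matrix A: rows are [1, x_i, x_i^2, x_i^3]
A = np.vander(xs, 4, increasing=True)

# Normal equations: A^T A f = A^T g
f = np.linalg.solve(A.T @ A, A.T @ g)
print(f)  # [a, b, c, d]
[/code]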
 
  • #21
Sorry if you are getting bored of me - I still can't extend my method to get a 3rd-degree polynomial.
Could someone look back at my example and possibly help me move this on?

Thanks
 
  • #22
If you would actually sit down with the formulas and try to do them, I'm sure you would figure it out. Actually plug in your numbers and write out the summations. Or use Hotvette's method, which is the most explicit equation anyone's given you.

At this point, you're just dragging your feet and begging other people to do your work for you. If you don't understand the math that's involved, then stop complaining and make an effort to learn it.
 

1. What is the least squares method in cubic regression?

The least squares method is a statistical technique used to find the best-fit line or curve through a set of data points. In cubic regression, it is used to find the cubic polynomial that minimizes the sum of the squared distances between the data points and the curve.

2. How is the sum of squared errors calculated in cubic regression?

The sum of squared errors (SSE) is calculated by taking the difference between each data point and the corresponding point on the fitted cubic curve, squaring each difference, and then summing all these squared errors together. The goal is to minimize this value to find the best-fitting cubic curve.
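
For concreteness, a minimal Python sketch of that calculation (the helper name sse and the coefficient ordering, lowest power first, are just conventions chosen here):

[code]
import numpy as np
from numpy.polynomial import polynomial as P

def sse(coeffs, x, y):
    """Sum of squared errors of a polynomial fit."""
    return np.sum((P.polyval(x, coeffs) - y) ** 2)
[/code]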

3. What is the difference between cubic regression and linear regression?

Cubic regression is a type of polynomial regression, which uses a higher degree polynomial (in this case, a cubic function) to fit the data, while linear regression uses a straight line to fit the data. Cubic regression is more flexible and can capture more complex relationships between the variables, but it can also be more prone to overfitting.

4. How is the cubic regression model evaluated?

The cubic regression model is evaluated by looking at the coefficient of determination (R-squared), which measures the proportion of the variation in the dependent variable that is explained by the independent variables. A higher R-squared value indicates a better fit of the model to the data.
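
For example, a minimal Python sketch of that statistic (my own illustration):

[code]
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SSE/SST."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - sse / sst
[/code]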

5. What are some potential limitations of using cubic regression?

One potential limitation of using cubic regression is that it assumes a specific shape of the curve (a cubic polynomial) and may not be appropriate for all types of data. It can also be prone to overfitting if there are not enough data points to support the complexity of the model. Additionally, it may be difficult to interpret the coefficients and make meaningful predictions from the model.
