Solving for polynomial coefficients to minimize square error

In summary, the thread discusses finding the coefficients of a polynomial that minimize the square error with respect to a data set. The original poster is looking for a method to solve for the coefficients while fixing some of them in advance (for example, forcing the intercept to zero), suspects the problem may involve the calculus of variations, and considers using Lagrange multipliers. It is also noted that constraining the model can give a better fit to the underlying true signal.
  • #1
tomizzo
Hi there,

I'm working on a problem right now that relates to least squares error estimate for polynomial fitting.

I'm aware of techniques (iterative formulas) for finding the coefficients of a polynomial that minimizes the square error from a data set. So, for example, for a data set that I believe follows a first-order polynomial, y(x) = a0 + a1*x, I have a method for finding the coefficients a0 and a1.

However, what if I were to make an assumption about one of the coefficients? Specifically, let's say I want a0 = 0. The question then becomes: find y(x) = a1*x such that a1 minimizes the square error.

How would I go about solving this? The only method I know attempts to solve for a0 and a1, yet I'm making an assumption about a0 = 0, and am attempting to find a1.

Does anyone know how to solve this problem? Or is there a general solution? I believe the problem is really a calculus of variations problem, and I simply do not know the math.

TLDR - What is the method for solving for polynomial coefficients (in a least-squares sense) while being able to make assumptions about certain parameters?
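For concreteness, here is a minimal sketch of the two fits being discussed, assuming NumPy and placeholder data (the arrays below are illustrative, not from the thread): the unconstrained line y(x) = a0 + a1*x, and the through-origin line y(x) = a1*x obtained by simply leaving the constant column out of the least-squares problem.

```python
import numpy as np

# Placeholder data -- substitute the actual measurements.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 2.1, 3.9, 6.2, 7.8])

# Unconstrained fit y = a0 + a1*x (np.polyfit returns [a1, a0]).
a1_full, a0_full = np.polyfit(x, y, 1)

# Constrained fit y = a1*x: the design matrix has only the x column,
# so least squares solves for the slope alone.
coef, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
a1_forced = coef[0]

print(a0_full, a1_full, a1_forced)
```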
 
  • #2
tomizzo said:
Hi there,

I'm working on a problem right now that relates to least squares error estimate for polynomial fitting.

I'm aware of techniques (iterative formulas) for finding the coefficients of a polynomial that minimizes the square error from a data set. So, for example, for a data set that I believe follows a first-order polynomial, y(x) = a0 + a1*x, I have a method for finding the coefficients a0 and a1.

However, what if I were to make an assumption about one of the coefficients? Specifically, let's say I want a0 = 0. The question then becomes: find y(x) = a1*x such that a1 minimizes the square error.

How would I go about solving this? The only method I know attempts to solve for a0 and a1, yet I'm making an assumption about a0 = 0, and am attempting to find a1.
But is this a reasonable assumption (i.e., that a0 = 0)? If you had a set of data for which you could find the line of best fit y(x) = A + Bx, then it seems to me that the line y(x) = kx would not be a good fit. Or are you trying to find the line whose fit is "least bad"?
tomizzo said:
Does anyone know how to solve this problem? Or is there a general solution? I believe the problem is really a calculus of variations problem, and I simply do not know the math.

TLDR - What is the method for solving for polynomial coefficients (in a least-squares sense) while being able to make assumptions about certain parameters?
 
  • #3
Then it becomes a one-dimensional problem, ##y - a_0 = a_1 x##, but this is not the general form of a line.
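In other words, once a0 is pinned to a known constant the slope is the only unknown. A tiny sketch of that reduction (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def fit_slope_with_fixed_intercept(x, y, a0):
    """Least-squares slope a1 of y = a0 + a1*x when a0 is held fixed."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(y, dtype=float) - a0   # remove the known intercept
    return (x @ r) / (x @ x)              # one-dimensional least squares for a1
```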
 
  • #4
Mark44 said:
But is this a reasonable assumption (i.e., that a0 = 0)? If you had a set of data for which you could find the line of best fit y(x) = A + Bx, then it seems to me that the line y(x) = kx would not be a good fit. Or are you trying to find the line whose fit is "least bad"?
Hi Mark,

Thanks for the response. So I'm definitely not an expert in this area, but here's what I know (or that I believe I know).

You are correct in the sense that I could try to fit y(x) = A + Bx and solve for A and B. The resulting function would indeed better fit the collected data; however, it would not better fit the underlying true signal if I believe that signal has the form y(x) = Bx. Theoretically, I could fit an infinite-degree polynomial and reduce the square error further, but it wouldn't fit the actual underlying signal any better.

So in my case, I believe that the true underlying signal is of the form y(x) = Bx. Thus I should constrain the model as much as I can... But yeah, you could also say I'm trying to find a line that is 'least bad'.

So this question originally stemmed from a specific instance; however, I'm now curious about the general case in which you can add constraints. I'm also curious whether this is a problem where Lagrange multipliers may be of use...
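On the Lagrange-multiplier question: they do apply when the model is linear in its coefficients and the constraint is linear, because minimizing ||y - Xa||^2 subject to Ca = d leads to a single linear (KKT) system. The sketch below is a generic illustration under those assumptions (NumPy assumed; the function name and example constraint are illustrative):

```python
import numpy as np

def constrained_least_squares(X, y, C, d):
    """Minimize ||y - X a||^2 subject to C a = d via Lagrange multipliers.

    Stationarity of the Lagrangian gives 2 X^T X a + C^T lam = 2 X^T y,
    which together with C a = d forms one linear system in (a, lam).
    """
    n = X.shape[1]                          # number of coefficients
    m = C.shape[0]                          # number of constraints
    K = np.block([[2.0 * X.T @ X, C.T],
                  [C, np.zeros((m, m))]])
    rhs = np.concatenate([2.0 * X.T @ y, d])
    return np.linalg.solve(K, rhs)[:n]      # drop the multipliers

# Example: force the intercept of y = a0 + a1*x to zero.
# X = np.column_stack([np.ones_like(x), x]); C = np.array([[1.0, 0.0]]); d = np.array([0.0])
```

For the simple a0 = 0 case this reproduces the through-origin fit, so the multiplier machinery only earns its keep for less trivial constraints.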
 
  • #5
tomizzo said:
Hi Mark,

Thanks for the response. So I'm definitely not an expert in this area, but here's what I know (or that I believe I know).

You are correct in the sense that I could try to fit y(x) = A + Bx and solve for A and B. The resulting function would indeed better fit the collected data; however, it would not better fit the underlying true signal if I believe that signal has the form y(x) = Bx. Theoretically, I could fit an infinite-degree polynomial and reduce the square error further, but it wouldn't fit the actual underlying signal any better.

So in my case, I believe that the true underlying signal is of the form y(x) = Bx. Thus I should constrain the model as much as I can... But yeah, you could also say I'm trying to find a line that is 'least bad'.
If, as you said above, y = A + Bx "would indeed better fit the collected data" then why not use this line? If the data actually does lie close to the line y = Bx, then the regression line calculation should show that A is zero or close to it. I don't see the need for forcing A to be zero a priori.
tomizzo said:
So this question originally stemmed from a specific instance; however, I'm now curious about the general case in which you can add constraints. I'm also curious whether this is a problem where Lagrange multipliers may be of use...
I don't know. It's been too many years since I've done anything with these, so I can't say. I'm not even sure they are applicable here.
 
  • #6
Mark44 said:
If, as you said above, y = A + Bx "would indeed better fit the collected data" then why not use this line? If the data actually does lie close to the line y = Bx, then the regression line calculation should show that A is zero or close to it. I don't see the need for forcing A to be zero a priori.
I don't know. It's been too many years since I've done anything with these, so I can't say. I'm not even sure they are applicable here.

Yes, it would better fit the collected data. However, it would not be a better approximation of the true underlying signal if I know that the signal has the form y = Bx. By using the model y = A + Bx, my coefficient estimates would be more influenced by noise in the data set.
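That claim is easy to check numerically. A quick simulation sketch (NumPy assumed; the noise level and sample size are made up) comparing the spread of the slope estimates from the two models when the true intercept really is zero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.1, 1.0, 20)
true_slope = 2.0

slopes_full, slopes_origin = [], []
for _ in range(2000):
    y = true_slope * x + rng.normal(scale=0.5, size=x.size)
    slopes_full.append(np.polyfit(x, y, 1)[0])    # slope of y = A + B*x
    slopes_origin.append((x @ y) / (x @ x))       # slope of y = B*x

# The through-origin estimator typically shows the smaller spread.
print(np.std(slopes_full), np.std(slopes_origin))
```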
 
  • #8
tomizzo said:
Does anyone know how to solve this problem? Or if there is a general solution to solve this? I believe the problem is really a calculus of variation problems, and I simply do not know the math.

Least squares fitting of a polynomial to data is the type of problem where you have a function f(A,B,C...) of several variables and you find the extrema of such a function by creating a system of linear equations. Each equation comes from setting the partial derivative of f with respect to one of its arguments equal to zero.

In the case of data ## (x_i, y_i)## and ## f(A) = \sum_{i=1}^n (y_i - Ax_i)^2 ## we have the single equation:

## 0 = \frac{\partial f}{\partial A} = \sum_{i=1}^n 2(y_i - Ax_i)(-x_i) ##

## = -2 \sum_{i=1}^n x_i y_i + 2A \sum_{i=1}^n {x_i}^2, ##

so ## A = \frac{\sum_{i=1}^n x_i y_i}{ \sum_{i=1}^n {x_i}^2} ## is an extremum of ##f## (in fact the minimum, since ##f## is a convex quadratic in ##A##).
tomizzo said:
TLDR - What is the method for solving for polynomial coefficients (in a least-squares sense) while being able to make assumptions about certain parameters?

It depends on the assumptions, but if you want to assume particular coefficients take particular constant values (such as zero) you don't compute any partial derivatives with respect to those coefficients since they aren't variables.
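That recipe extends directly to higher-degree polynomials: treat the pinned coefficients as constants, move their contribution to the other side, and solve the least-squares problem for the remaining columns only. A minimal sketch under those assumptions (NumPy assumed; the function name and `fixed` argument are illustrative):

```python
import numpy as np

def polyfit_with_fixed_coeffs(x, y, degree, fixed):
    """Least-squares fit of y = sum_k a_k x^k with some a_k pinned to constants.

    `fixed` maps a power k to its pinned value, e.g. {0: 0.0} forces a0 = 0.
    Returns the full coefficient vector a0 .. a_degree.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    V = np.vander(x, degree + 1, increasing=True)     # columns x^0, x^1, ...
    free = [k for k in range(degree + 1) if k not in fixed]

    # Subtract the pinned terms from y, then fit only the free columns.
    y_adj = y - sum(v * V[:, k] for k, v in fixed.items())
    a_free, *_ = np.linalg.lstsq(V[:, free], y_adj, rcond=None)

    coeffs = np.zeros(degree + 1)
    coeffs[free] = a_free
    for k, v in fixed.items():
        coeffs[k] = v
    return coeffs
```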
 
  • #9
Stephen Tashi said:
Least squares fitting of a polynomial to data is the type of problem where you have a function f(A,B,C...) of several variables and you find the extrema of such a function by creating a system of linear equations. Each equation comes from setting the partial derivative of f with respect to one of its arguments equal to zero.

In the case of data ## (x_i, y_i)## and ## f(A) = \sum_{i=1}^n (y_i - Ax_i)^2 ## we have the single equation:

## 0 = \frac{\partial f}{\partial A} = \sum_{i=1}^n 2(y_i - Ax_i)(-x_i) ##

## = -2 \sum_{i=1}^n x_i y_i + 2A \sum_{i=1}^n {x_i}^2, ##

so ## A = \frac{\sum_{i=1}^n x_i y_i}{ \sum_{i=1}^n {x_i}^2} ## is an extremum of ##f## (in fact the minimum, since ##f## is a convex quadratic in ##A##).

It depends on the assumptions, but if you want to assume particular coefficients take particular constant values (such as zero) you don't compute any partial derivatives with respect to those coefficients since they aren't variables.

Thank you for this derivation. This is the exact result I got, but I didn't have much confidence in it and assumed there was a general formula.

I was also able to stumble across this on the wikipedia page earlier this morning - https://en.wikipedia.org/wiki/Simple_linear_regression#Linear_regression_without_the_intercept_term

This outlines the exact problem and reaches the same conclusion as our answer.
 

1. What is the purpose of solving for polynomial coefficients to minimize square error?

The purpose of solving for polynomial coefficients that minimize square error is to find the best-fitting polynomial equation for a given set of data points. This allows us to make accurate predictions and understand the relationship between the variables in the data.

2. How is the square error calculated in this process?

The square error is calculated by taking the difference between the actual data points and the predicted values from the polynomial equation, squaring each difference, and then summing all the squared differences together. The goal is to find the coefficients that minimize this sum of squared errors.
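As a short sketch of that quantity (NumPy assumed; coefficient ordering follows np.polyval, highest degree first):

```python
import numpy as np

def sum_squared_error(coeffs, x, y):
    """Sum of squared residuals of a polynomial on data (x, y)."""
    residuals = np.asarray(y, float) - np.polyval(coeffs, np.asarray(x, float))
    return float(np.sum(residuals ** 2))
```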

3. What is the most commonly used method for solving for polynomial coefficients?

The most commonly used method for finding polynomial coefficients that minimize square error is the method of least squares. This involves finding the values of the coefficients that minimize the sum of squared errors, which can be done using calculus or linear algebra.
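In the linear-algebra formulation this amounts to solving the normal equations; a minimal sketch (NumPy assumed, and not as numerically robust as a dedicated routine such as np.polyfit):

```python
import numpy as np

def polyfit_normal_equations(x, y, degree):
    """Coefficients a0 .. a_degree that minimize the sum of squared errors,
    obtained from the normal equations X^T X a = X^T y."""
    X = np.vander(np.asarray(x, float), degree + 1, increasing=True)
    return np.linalg.solve(X.T @ X, X.T @ np.asarray(y, float))
```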

4. Can solving for polynomial coefficients to minimize square error be applied to any data set?

Yes, this method can be applied to any data set for which a polynomial relationship between the variables is a reasonable model. However, it may not be the most suitable approach for certain types of data, such as data with outliers or data whose relationship is not well described by a polynomial.

5. Is it possible to have multiple sets of coefficients that minimize the square error?

In general, no. For a polynomial model the sum of squared errors is a convex quadratic function of the coefficients, so the minimizer is unique whenever the data contain at least as many distinct x values as the polynomial has coefficients. Multiple minimizing coefficient sets occur only in degenerate cases, for example when the degree is so high relative to the number of distinct data points that several different polynomials fit the data equally well.
