Can you give me a least squares example?


Discussion Overview

The discussion revolves around the application of the least squares method for estimating coefficients of a polynomial function based on experimental data points. Participants seek to understand both the theoretical framework and practical implementation, particularly through matrix representation.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant requests an example of the least squares method for estimating coefficients a0, a1, a2, and a3 in a cubic polynomial function.
  • Another participant suggests minimizing the sum of squared differences between observed and estimated values.
  • A participant provides a matrix representation of the polynomial function and discusses the relationship between the data points and the coefficients.
  • Some participants express the need for a clear understanding of how to form the matrix from the given data points.
  • There is a discussion about the nature of the solution, with one participant questioning whether the coefficients obtained are estimates or exact values.
  • Participants explore the possibility of using other types of functions beyond polynomials for estimation.

Areas of Agreement / Disagreement

Participants generally agree on the need for a matrix approach to solve the least squares problem, but there is no consensus on the interpretation of the coefficients as estimates versus exact values. Additionally, there are differing views on the applicability of the method to various types of functions.

Contextual Notes

Some participants note that the matrix X is not square, which complicates the inversion process and leads to the need for an estimation approach. There are also unresolved questions regarding the assumptions made in the application of the least squares method.

hkBattousai
Can you give me a "least squares" example?

Assume that, I have a function to estimate like below:

f(x) = a_3x^3 + a_2x^2 + a_1x^1 + a_0x^0

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but no examples. I will be glad if you guys spare your time to help me.
 


Minimize sum

[tex]\sum (y_i - f(x_i))^2[/tex]
 


hkBattousai said:
Assume that, I have a function to estimate like below:

f(x) = a_3x^3 + a_2x^2 + a_1x^1 + a_0x^0

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but no examples. I will be glad if you guys spare your time to help me.


Can you please give me a matrix representation?

Experimental input vector:
X = [x_1 x_2 x_3 x_4 x_5 x_6]^T

Output vector of the experiment:
Y = [y_1 y_2 y_3 y_4 y_5 y_6]^T

Coefficients of the polynomial in f(x):
A = [a_0 a_1 a_2 a_3]^T
(Or A = [a_3 a_2 a_1 a_0]^T; please specify which ordering you choose.)


How can I find the vector A in terms of the experiment result vectors X and Y?
 


To understand the idea, there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs (x_i, y_i). You look for a and b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a and b - and you are done. This is high school math.

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.
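Borek's calculus recipe can be sketched in a few lines of Python. This is just an illustration of the linear case y = ax + b; the data values are invented:

```python
# A sketch of the calculus approach: for y = a*x + b, setting d/da and
# d/db of sum((y_i - a*x_i - b)^2) to zero yields two linear equations
# in a and b.  Data values below are made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.2]

n = len(xs)
sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# The two normal equations:
#   a*sxx + b*sx = sxy
#   a*sx  + b*n  = sy
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n
print(a, b)  # best-fit slope and intercept
```

The same elimination that solves these two equations by hand is what the matrix formulation later in the thread automates for any number of coefficients.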
 


^ Thank you for your answer.

I'm a grad student; one of my courses covers this least-squares (LMS) topic. My textbook doesn't explain how the theorem is applied; it just gives the solution in an example. I need to learn how to implement this theorem by means of matrices. Internet sources give the formal definition of the theorem, but unfortunately there is no example.

I will be happy if you could give me a starting point.
 


Think of [itex]a_3x^3+ a_2x^2+ a_1x+ a_0[/itex] as the matrix product
[tex]\begin{bmatrix}x^3 & x^2 & x & 1 \end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0 \end{bmatrix}[/tex]

Since you have 6 data points, you have that repeated 6 times- a matrix product with 6 rows:

[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

Writing that as Ax = y, where x, the vector of "a"s, is 4-dimensional and y lies in a 6-dimensional space, Ax ranges over a 4-dimensional subspace, and the equation has an exact solution only if y happens to lie in that subspace. If it does not, then the "closest" we can get to y is the projection of y onto that subspace. In particular, that means y - Ax must be orthogonal to that subspace: <Au, y - Ax> = 0 for all u in [itex]R^4[/itex]. Letting [itex]A^*[/itex] be the adjoint (transpose) of A, this becomes [itex]<u, A^*(y - Ax)> = 0[/itex].

But now, since that inner product is in [itex]R^4[/itex] and u could be any vector in [itex]R^4[/itex], we must have [itex]A^*(y- Ax)= A^*y- A^*Ax= 0[/itex] or [itex]A^*Ax= A^*y[/itex]. If [itex]A^*A[/itex] has an inverse (which it typically does in problems like this), [itex]x= (A^*A)^{-1}A^*y[/itex] gives the coefficients for the "least squares" cubic approximation.
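As a concrete sketch of this formula with numpy, using HallsofIvy's naming (the design matrix as A, the coefficient vector as x). The six data points are invented for illustration, and np.linalg.solve is used on the normal equations rather than forming the inverse explicitly:

```python
# Least squares cubic fit via the normal equations A^T A x = A^T y.
# Data points are made up for illustration.
import numpy as np

xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
ys = np.array([-7.9, -0.8, 1.1, 1.0, 3.2, 13.1])

# Design matrix: one row [x^3, x^2, x, 1] per data point (6 by 4).
A = np.column_stack([xs**3, xs**2, xs, np.ones_like(xs)])

# Solve A^T A x = A^T y; solving is preferable to computing (A^T A)^{-1}.
coeffs = np.linalg.solve(A.T @ A, A.T @ ys)  # [a3, a2, a1, a0]
print(coeffs)
```

With six points and only four coefficients, these coefficients generally will not reproduce the data exactly; they minimize the sum of squared residuals, exactly as derived above.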
 


Sorry for the late reply.

HallsofIvy said:
[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

This equation is XA = Y, isn't it?
Only the Y matrix is given. How can I form the X matrix here?

Thank you so much for your help.
 


I'm thinking of the 6 by 4 matrix made from the [itex]x_i[/itex] as "A", and the column matrix made from the [itex]a_i[/itex] as "X".

You said, in your original post, that
After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

so you have both the [itex]x_i[/itex] and the [itex]y_i[/itex]. If you were given only the y-values, with no corresponding x information, there would be no possible way to set up a formula.
 


hkBattousai said:
How can I estimate a0, a1, a2 and a3?

I searched in Google, there are lots of definition of the theorem, but there is no example. I will be glad if you guys spare your time to help me.

Here is an explanation that might be useful.

http://www.personal.psu.edu/jhm/f90/lectures/lsq2.html

The final matrix equation is equivalent to the linear system [itex]A^TAx = A^Tb[/itex] (called the normal equations), which can be solved by Gaussian elimination or via matrix factorization techniques (e.g. LU, Cholesky, QR, SVD).
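In practice, library routines often avoid forming the normal equations at all, since [itex]A^TA[/itex] squares the condition number of A; they work on A directly via QR or SVD. A minimal numpy sketch, with the same invented six data points as the cubic example in this thread:

```python
# Solving the least squares problem directly with np.linalg.lstsq,
# which uses an SVD-based LAPACK driver internally.  Data invented.
import numpy as np

xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
ys = np.array([-7.9, -0.8, 1.1, 1.0, 3.2, 13.1])
A = np.column_stack([xs**3, xs**2, xs, np.ones_like(xs)])

# Returns the minimizer of ||A x - y||^2 without forming A^T A.
coeffs, residuals, rank, sv = np.linalg.lstsq(A, ys, rcond=None)
print(coeffs)  # [a3, a2, a1, a0]
```

For well-conditioned problems this agrees with the normal-equations solution to machine precision; for ill-conditioned design matrices it is the numerically safer route.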
 
  • #10


I actually found the description on Mathworld rather good.
 
  • #11


I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
http://img704.imageshack.us/img704/6940/dscf4205.jpg
I realize now that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y; why can't we just solve it as A = X^{-1}Y instead of using A = (X^TX)^{-1}X^TY?

Q2) Is this solution for A an "estimate" or the real A? It seems obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?

Q3) We use this method to estimate the test results as a polynomial. But do we have to estimate it as a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
http://img80.imageshack.us/img80/7320/dscf4204.jpg
 
Last edited by a moderator:
  • #12


hkBattousai said:
I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
http://img704.imageshack.us/img704/6940/dscf4205.jpg
I realize now that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y; why can't we just solve it as A = X^{-1}Y instead of using A = (X^TX)^{-1}X^TY?
Because X, in general, doesn't have an inverse. Here you are trying to fit a cubic, with four coefficients, to six points, so you have a 6 by 4 matrix. That is not a square matrix and so does not have an inverse. You can always fit a line to two points, a quadratic to three points, and a cubic to four points exactly, because then you have the same number of coefficients as equations, and so a square matrix that you can invert.

Q2) Is this solution for A an "estimate" or the real A? It seems obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?
What do you mean by the "real" A? In general there is NO cubic that actually passes through six given points, so there is NO "real" A in that sense.

Q3) We use this method to estimate the test results as a polynomial. But do we have to estimate it as a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
http://img80.imageshack.us/img80/7320/dscf4204.jpg
No, there are many other functions that are commonly used - exponential, sine, and cosine functions, for example. And yes, exactly the same formulas apply.
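The same machinery with a non-polynomial basis can be sketched as follows: the columns of the design matrix are simply the chosen basis functions evaluated at the data points. The basis and data here are invented for illustration (noiseless, so the fit recovers the generating coefficients):

```python
# Least squares fit of f(x) = c0 + c1*sin(x) + c2*cos(x).
# The x-values and generating coefficients are made up; since the data
# are noiseless, the fit should recover [1.0, 2.0, -0.5].
import numpy as np

xs = np.linspace(0.0, 6.0, 20)
ys = 1.0 + 2.0 * np.sin(xs) - 0.5 * np.cos(xs)

# Design matrix: one column per basis function.
A = np.column_stack([np.ones_like(xs), np.sin(xs), np.cos(xs)])
c, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(c)
```

The key point, matching the answer above, is that the model only needs to be linear in the unknown coefficients; the basis functions themselves can be anything.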
 
Last edited by a moderator:
  • #13


I'm sorry for the late reply.
I don't know why, but I didn't receive an email notification for your reply this time, though by default I receive email notifications of replies.

Anyway, all your answers are satisfactory to me.
Thank you so much for your help.
 
  • #14


Borek said:
To understand the idea, there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs (x_i, y_i). You look for a and b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a and b - and you are done. This is high school math.

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.

I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with 4 + 4a^2 + 400b^2 = minimum.
This might sound silly to you guys, but what do I do now?
 
  • #15


joecampbell said:
I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with 4 + 4a^2 + 400b^2 = minimum.
This might sound silly to you guys, but what do I do now?

Why don't you use the general formula in the picture in post #11?
You only have to modify the X matrix to include only the x^1 and x^0 columns.
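As a sketch of that suggestion applied to the four points above, with a design matrix containing only the x^1 and x^0 columns (i.e. fitting a straight line y = ax + b):

```python
# Least squares line fit for the points (-10,1), (-10,-1), (10,1), (10,-1).
# The design matrix has just two columns: [x, 1].
import numpy as np

xs = np.array([-10.0, -10.0, 10.0, 10.0])
ys = np.array([1.0, -1.0, 1.0, -1.0])

A = np.column_stack([xs, np.ones_like(xs)])
ab, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(ab)  # by symmetry the best-fit line is y = 0, i.e. a = b = 0
```

Each x-value pairs with both y = 1 and y = -1, so the errors cancel and the minimizing line is flat; this is a useful sanity check on any derivative-based calculation for these points.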
 
