Can you give me a least squares example?


Discussion Overview

The discussion revolves around the application of the least squares method for estimating coefficients of a polynomial function based on experimental data points. Participants seek to understand both the theoretical framework and practical implementation, particularly through matrix representation.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant requests an example of the least squares method for estimating coefficients a0, a1, a2, and a3 in a cubic polynomial function.
  • Another participant suggests minimizing the sum of squared differences between observed and estimated values.
  • A participant provides a matrix representation of the polynomial function and discusses the relationship between the data points and the coefficients.
  • Some participants express the need for a clear understanding of how to form the matrix from the given data points.
  • There is a discussion about the nature of the solution, with one participant questioning whether the coefficients obtained are estimates or exact values.
  • Participants explore the possibility of using other types of functions beyond polynomials for estimation.

Areas of Agreement / Disagreement

Participants generally agree on the need for a matrix approach to solve the least squares problem, but there is no consensus on the interpretation of the coefficients as estimates versus exact values. Additionally, there are differing views on the applicability of the method to various types of functions.

Contextual Notes

Some participants note that the matrix X is not square, which complicates the inversion process and leads to the need for an estimation approach. There are also unresolved questions regarding the assumptions made in the application of the least squares method.

hkBattousai
Can you give me a "least squares" example?

Assume that, I have a function to estimate like below:

f(x) = a_3x^3 + a_2x^2 + a_1x^1 + a_0x^0

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but no examples. I will be glad if you guys spare your time to help me.
 


Minimize sum

[tex]\sum (y_i - f(x_i))^2[/tex]
 


hkBattousai said:
Assume that, I have a function to estimate like below:

f(x) = a_3x^3 + a_2x^2 + a_1x^1 + a_0x^0

After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

How can I estimate a0, a1, a2 and a3?

I searched on Google; there are lots of definitions of the theorem, but no examples. I will be glad if you guys spare your time to help me.


Can you please give me a matrix representation?

Experimental input vector:
X = [x_1 x_2 x_3 x_4 x_5 x_6]^T

Output vector of the experiment:
Y = [y_1 y_2 y_3 y_4 y_5 y_6]^T

Coefficients of the polynomial in f(x):
A = [a_0 a_1 a_2 a_3]^T
(Or A = [a_3 a_2 a_1 a_0]^T; please specify which ordering you choose.)


How can I find the vector A in terms of the experiment result vectors X and Y?
 


To understand the idea, there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs (x_i, y_i). You look for a and b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a and b - and you are done. This is high school math.

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.
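Borek's calculus recipe can be sketched in a few lines of Python. This is just an illustration of the linear case y = ax + b; the data values are invented:

```python
# A sketch of the calculus approach: for y = a*x + b, setting d/da and
# d/db of sum((y_i - a*x_i - b)^2) to zero yields two linear equations
# in a and b.  Data values below are made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.2]

n = len(xs)
sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# The two normal equations:
#   a*sxx + b*sx = sxy
#   a*sx  + b*n  = sy
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n
print(a, b)  # best-fit slope and intercept
```

The same elimination that solves these two equations by hand is what the matrix formulation later in the thread automates for any number of coefficients.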
 


^ Thank you for your answer.

I'm a grad student; one of my courses covers this least-squares (LMS) topic. My textbook doesn't explain how the theorem is applied; it just gives the solution in an example. I need to learn how to implement this theorem by means of matrices. Internet sources give the formal definition of the theorem, but unfortunately there is no example.

I will be happy if you could give me a starting point.
 


Think of [itex]a_3x^3+ a_2x^2+ a_1x+ a_0[/itex] as the matrix product
[tex]\begin{bmatrix}x^3 & x^2 & x & 1 \end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0 \end{bmatrix}[/tex]

Since you have 6 data points, you have that repeated 6 times- a matrix product with 6 rows:

[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

Writing that as Ax = y, where x, the vector of "a"s, is 4-dimensional and y lies in a 6-dimensional space, Ax ranges over a 4-dimensional subspace, and the equation has an exact solution only if y happens to lie in that subspace. If it does not, then the "closest" we can get to y is the projection of y onto that subspace. In particular, that means y - Ax must be orthogonal to that subspace: <Au, y - Ax> = 0 for all u in [itex]R^4[/itex]. Letting [itex]A^*[/itex] be the adjoint (transpose) of A, this becomes [itex]<u, A^*(y - Ax)> = 0[/itex].

But now, since that inner product is in [itex]R^4[/itex] and u could be any vector in [itex]R^4[/itex], we must have [itex]A^*(y- Ax)= A^*y- A^*Ax= 0[/itex] or [itex]A^*Ax= A^*y[/itex]. If [itex]A^*A[/itex] has an inverse (which it typically does in problems like this), [itex]x= (A^*A)^{-1}A^*y[/itex] gives the coefficients for the "least squares" cubic approximation.
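As a concrete sketch of this formula with numpy, using HallsofIvy's naming (the design matrix as A, the coefficient vector as x). The six data points are invented for illustration, and np.linalg.solve is used on the normal equations rather than forming the inverse explicitly:

```python
# Least squares cubic fit via the normal equations A^T A x = A^T y.
# Data points are made up for illustration.
import numpy as np

xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
ys = np.array([-7.9, -0.8, 1.1, 1.0, 3.2, 13.1])

# Design matrix: one row [x^3, x^2, x, 1] per data point (6 by 4).
A = np.column_stack([xs**3, xs**2, xs, np.ones_like(xs)])

# Solve A^T A x = A^T y; solving is preferable to computing (A^T A)^{-1}.
coeffs = np.linalg.solve(A.T @ A, A.T @ ys)  # [a3, a2, a1, a0]
print(coeffs)
```

With six points and only four coefficients, these coefficients generally will not reproduce the data exactly; they minimize the sum of squared residuals, exactly as derived above.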
 


Sorry for the late reply.

HallsofIvy said:
[tex]\begin{bmatrix} x_1^3 & x_1^2 & x_1 & 1 \\ x_2^3 & x_2^2 & x_2 & 1 \\ x_3^3 & x_3^2 & x_3 & 1 \\ x_4^3 & x_4^2 & x_4 & 1 \\ x_5^3 & x_5^2 & x_5 & 1 \\ x_6^3 & x_6^2 & x_6 & 1\end{bmatrix}\begin{bmatrix}a_3 \\ a_2 \\ a_1 \\ a_0\end{bmatrix}= \begin{bmatrix}y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6\end{bmatrix}[/tex]

This equation is XA = Y, isn't it?
Only the Y matrix is given. How can I form the X matrix here?

Thank you so much for your help.
 


I'm thinking of the 6 by 4 matrix made from the [itex]x_i[/itex] as "A", and the column matrix made from the [itex]a_i[/itex] as "X".

You said, in your original post, that
After several experiments I have obtained these (x, f(x)) pairs:
(x1, y1)
(x2, y2)
(x3, y3)
(x4, y4)
(x5, y5)
(x6, y6)

so you have both the [itex]x_i[/itex] and the [itex]y_i[/itex]. If you were given only the y-values, with no corresponding x information, there would be no possible way to set up a formula.
 


hkBattousai said:
How can I estimate a0, a1, a2 and a3?

I searched in Google, there are lots of definition of the theorem, but there is no example. I will be glad if you guys spare your time to help me.

Here is an explanation that might be useful.

http://www.personal.psu.edu/jhm/f90/lectures/lsq2.html

The final matrix equation is equivalent to the linear system [itex]A^TAx = A^Tb[/itex] (called the normal equations), which can be solved by Gaussian elimination or via matrix factorization techniques (e.g. LU, Cholesky, QR, SVD).
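In practice, library routines often avoid forming the normal equations at all, since [itex]A^TA[/itex] squares the condition number of A; they work on A directly via QR or SVD. A minimal numpy sketch, with the same invented six data points as the cubic example in this thread:

```python
# Solving the least squares problem directly with np.linalg.lstsq,
# which uses an SVD-based LAPACK driver internally.  Data invented.
import numpy as np

xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
ys = np.array([-7.9, -0.8, 1.1, 1.0, 3.2, 13.1])
A = np.column_stack([xs**3, xs**2, xs, np.ones_like(xs)])

# Returns the minimizer of ||A x - y||^2 without forming A^T A.
coeffs, residuals, rank, sv = np.linalg.lstsq(A, ys, rcond=None)
print(coeffs)  # [a3, a2, a1, a0]
```

For well-conditioned problems this agrees with the normal-equations solution to machine precision; for ill-conditioned design matrices it is the numerically safer route.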
 
  • #10


I actually found the description on Mathworld rather good.
 
  • #11


I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
http://img704.imageshack.us/img704/6940/dscf4205.jpg
I realize now that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y; why can't we just solve it as A = X^{-1}Y instead of using A = (X^TX)^{-1}X^TY?

Q2) Is this solution for A an "estimate" or the real A? It seems obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?

Q3) We use this method to estimate the test results as a polynomial. But do we have to estimate it as a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
http://img80.imageshack.us/img80/7320/dscf4204.jpg
 
Last edited by a moderator:
  • #12


hkBattousai said:
I found the explanation of the method in a textbook, and I want to share it here. But since I'm not quite familiar with LaTeX, I will attach photos instead:
http://img704.imageshack.us/img704/6940/dscf4205.jpg
I realize now that this solution is the same as the one HallsofIvy offered; I wish I had understood what he meant earlier...

Q1) The equation is XA = Y; why can't we just solve it as A = X^{-1}Y instead of using A = (X^TX)^{-1}X^TY?
Because X, in general, doesn't have an inverse. Here you are trying to fit a cubic, with four coefficients, to six points, so you have a 6 by 4 matrix. That is not a square matrix and so does not have an inverse. You can always fit a line to two points, a quadratic to three points, and a cubic to four points exactly, because then you have the same number of coefficients as equations, and so a square matrix that you can invert.

Q2) Is this solution for A an "estimate" or the real A? It seems obvious that the solution is just an estimate, but why? The solution of the equation in the picture (XA = Y) is straightforward; after which step do we say that the A vector is an estimate rather than the real A?
What do you mean by the "real" A? In general there is NO cubic that actually passes through six given points, so there is NO "real" A in that sense.

Q3) We use this method to estimate the test results as a polynomial. But do we have to estimate it as a polynomial only? I mean, can we estimate f(x) in terms of other kinds of functions? The picture below illustrates what I'm trying to ask:
http://img80.imageshack.us/img80/7320/dscf4204.jpg
No, there are many other functions that are commonly used - exponential, sine, and cosine functions, for example. And yes, exactly the same formulas apply.
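The same machinery with a non-polynomial basis can be sketched as follows: the columns of the design matrix are simply the chosen basis functions evaluated at the data points. The basis and data here are invented for illustration (noiseless, so the fit recovers the generating coefficients):

```python
# Least squares fit of f(x) = c0 + c1*sin(x) + c2*cos(x).
# The x-values and generating coefficients are made up; since the data
# are noiseless, the fit should recover [1.0, 2.0, -0.5].
import numpy as np

xs = np.linspace(0.0, 6.0, 20)
ys = 1.0 + 2.0 * np.sin(xs) - 0.5 * np.cos(xs)

# Design matrix: one column per basis function.
A = np.column_stack([np.ones_like(xs), np.sin(xs), np.cos(xs)])
c, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(c)
```

The key point, matching the answer above, is that the model only needs to be linear in the unknown coefficients; the basis functions themselves can be anything.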
 
Last edited by a moderator:
  • #13


I'm sorry for the late reply.
I don't know why, but I didn't receive an email notification for your reply this time, though by default I receive email notifications of replies.

Anyway, all your answers are satisfactory to me.
Thank you so much for your help.
 
  • #14


Borek said:
To understand the idea, there is no need for a matrix representation.

Let's say you want to do linear regression, y = ax + b. You have a set of pairs (x_i, y_i). You look for a and b such that the sum

[tex]\sum (y_i - ax_i - b)^2[/tex]

has minimum value. Calculate the derivatives (d/da, d/db) of the sum, set them to zero, and solve for a and b - and you are done. This is high school math.

Your example - with a third degree polynomial - is not linear in x, so I don't think you can use a simple vector X for your purposes. But I could be wrong.

I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with 4 + 4a^2 + 400b^2 = minimum.
This might sound silly to you guys, but what do I do now?
 
  • #15


joecampbell said:
I'm trying to do a least squares fit on these points: (-10, 1), (-10, -1), (10, 1), (10, -1),
and I ended up with 4 + 4a^2 + 400b^2 = minimum.
This might sound silly to you guys, but what do I do now?

Why don't you use the general formula in the picture in post #11?
You only have to modify the X matrix to include only the x^1 and x^0 columns.
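As a sketch of that suggestion applied to the four points above, with a design matrix containing only the x^1 and x^0 columns (i.e. fitting a straight line y = ax + b):

```python
# Least squares line fit for the points (-10,1), (-10,-1), (10,1), (10,-1).
# The design matrix has just two columns: [x, 1].
import numpy as np

xs = np.array([-10.0, -10.0, 10.0, 10.0])
ys = np.array([1.0, -1.0, 1.0, -1.0])

A = np.column_stack([xs, np.ones_like(xs)])
ab, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(ab)  # by symmetry the best-fit line is y = 0, i.e. a = b = 0
```

Each x-value pairs with both y = 1 and y = -1, so the errors cancel and the minimizing line is flat; this is a useful sanity check on any derivative-based calculation for these points.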
 
