# Least squares solution to simultaneous equations

1. Aug 7, 2009

### mgb_phys

I am trying to fit a transformation from one set of coordinates to another:
x' = R + Px + Qy
y' = S - Qx + Py

where P, Q, R, S are constants, with P = scale*cos(rotation) and Q = scale*sin(rotation).
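Since P and Q encode the scale and rotation, both can be recovered once P and Q have been fitted. A minimal sketch, assuming the rotation is wanted in radians; the function name `scale_and_rotation` is made up for illustration:

```python
import math

def scale_and_rotation(P, Q):
    """Invert P = scale*cos(rotation), Q = scale*sin(rotation)."""
    scale = math.hypot(P, Q)        # sqrt(P^2 + Q^2)
    rotation = math.atan2(Q, P)     # radians, in (-pi, pi]
    return scale, rotation

# Round trip for a scale of 2 and a rotation of 30 degrees
P = 2.0 * math.cos(math.radians(30.0))
Q = 2.0 * math.sin(math.radians(30.0))
scale, rotation = scale_and_rotation(P, Q)
print(scale, math.degrees(rotation))
```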

There is a well known 'by hand' formula for fitting P,Q,R,S to a set of corresponding points.
But I need to have an error estimate on the fit - so I need a least squares solution.

I'm having trouble working out how to do this from first principles for data sets with both x and y in them.
Can anyone point to an example, tutorial, or code sample of how to do this?

2. Aug 7, 2009

Why don't you try to minimize the sum of the distances from each (x',y') to its nearest neighbor (x,y)? (There are algorithms for finding those.)

3. Aug 7, 2009

### hotvette

I was thinking along similar lines. If y' = f(x,y,S,P,Q) and x' = g(x,y,R,P,Q) then one could minimize:

1/2 sum [f - y']^2 + 1/2 sum [g - x']^2
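As a rough sketch, that objective can be written as a plain function of the four parameters. The name `sum_of_squares` and the point lists `src` and `dst` are hypothetical; the two models are the x' and y' equations from the first post:

```python
def sum_of_squares(P, Q, R, S, src, dst):
    """0.5 * sum of squared residuals in x' and y' over matched point pairs."""
    total = 0.0
    for (x, y), (xp, yp) in zip(src, dst):
        g = R + P * x + Q * y      # model prediction for x'
        f = S - Q * x + P * y      # model prediction for y'
        total += 0.5 * (f - yp) ** 2 + 0.5 * (g - xp) ** 2
    return total

# Made-up points related by a pure shift: P=1, Q=0, R=2, S=3
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.0)]
exact = sum_of_squares(1.0, 0.0, 2.0, 3.0, src, dst)   # the true parameters
shifted = sum_of_squares(1.0, 0.0, 2.5, 3.0, src, dst) # R off by 0.5
print(exact, shifted)
```

Any general-purpose minimizer could be pointed at this function, although the problem is linear in P, Q, R, S and so does not need an iterative solver.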

4. Aug 7, 2009

### Redbelly98

Staff Emeritus
Seems like the right approach to me. This could be done in Excel, using the Solver add-in to minimize the sum.

The 1/2 factors aren't necessary.

5. Aug 7, 2009

### mgb_phys

I found http://mathworld.wolfram.com/LeastSquaresFitting.htm
Interestingly, the 100-year-old 'by hand' instructions I had from an old Army surveying manual are almost exactly the same algorithm.

Last edited by a moderator: May 4, 2017
6. Aug 7, 2009

### Redbelly98

Staff Emeritus
Last edited by a moderator: May 4, 2017
7. Aug 8, 2009

### hotvette

Maybe Excel Solver could do it, but it would be much more fun to write out the equations and solve them.

8. Aug 8, 2009

### Redbelly98

Staff Emeritus
If one really wants to do it that way, start with hotvette's sum of squares.

To minimize the sum, take partial derivatives with respect to P, Q, R, S and set each expression equal to zero. That gives 4 linear equations in 4 unknowns to be solved.

Writing out the sum in full:

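$$\chi^2 = \frac{1}{2}\sum\left[(Px + Qy + R - x')^2 + (-Qx + Py + S - y')^2\right]$$

Next, set ∂χ²/∂P = 0:

$$\begin{aligned} \frac{\partial\chi^2}{\partial P} &= \sum\left[(Px + Qy + R - x')x + (-Qx + Py + S - y')y\right] \\ &= \sum\left[(Px^2 + Qxy + Rx - x'x) + (-Qxy + Py^2 + Sy - y'y)\right] \\ &= P\sum(x^2 + y^2) + R\sum x + S\sum y - \sum(x'x + y'y) = 0 \end{aligned}$$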

And similarly for ∂χ²/∂Q, ∂χ²/∂R, and ∂χ²/∂S.

9. Aug 8, 2009

### Redbelly98

Staff Emeritus
I'm not sure how to get r², or the errors in the parameters, though.

10. Aug 9, 2009

### hotvette

Continuing the partial differentiation, I get the following:

$$\begin{bmatrix} \sum(x^2+y^2) & 0 & \sum x & \sum y \\ 0 & \sum(x^2+y^2) & \sum y & -\sum x \\ \sum x & \sum y & m & 0 \\ \sum y & -\sum x & 0 & m \end{bmatrix} \begin{bmatrix}P \\ Q \\ R \\ S \end{bmatrix} = \begin{bmatrix} \sum (y'y + x'x) \\ \sum (x'y - y'x) \\ \sum x' \\ \sum y' \end{bmatrix}$$
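A minimal NumPy sketch of building and solving this 4x4 system. The function name `solve_normal_equations` and the test points are made up for illustration; the matrix and right-hand side follow the equation above entry by entry:

```python
import numpy as np

def solve_normal_equations(x, y, xp, yp):
    """Solve the 4x4 normal equations for (P, Q, R, S)."""
    x, y, xp, yp = (np.asarray(v, dtype=float) for v in (x, y, xp, yp))
    m = x.size
    sx, sy = x.sum(), y.sum()
    sr2 = (x**2 + y**2).sum()
    N = np.array([
        [sr2, 0.0,  sx,  sy],
        [0.0, sr2,  sy, -sx],
        [sx,   sy,   m, 0.0],
        [sy,  -sx, 0.0,   m],
    ])
    rhs = np.array([
        (yp * y + xp * x).sum(),
        (xp * y - yp * x).sum(),
        xp.sum(),
        yp.sum(),
    ])
    return np.linalg.solve(N, rhs)

# Noise-free synthetic check with arbitrary parameters
x = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 3.0])
P, Q, R, S = 0.9, 0.3, 5.0, -2.0
xp = R + P * x + Q * y
yp = S - Q * x + P * y
params = solve_normal_equations(x, y, xp, yp)
print(params)
```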

Re the residual, I think it is just $r^2 = \sum (S-Qx+Py-y')^2 + \sum (R + Px + Qy - x')^2$ which is equivalent to:

$$r^2 = (Az-y')^T (Az-y') + (Bz-x')^T (Bz-x')$$

where:

$$A = \begin{bmatrix} y_1 & -x_1 & 0 & 1 \\ y_2 & -x_2 & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ y_m & -x_m & 0 & 1 \end{bmatrix} \quad B=\begin{bmatrix} x_1 & y_1 & 1 & 0 \\ x_2 & y_2 & 1 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ x_m & y_m & 1 & 0 \end{bmatrix} \quad z = \begin{bmatrix}P \\ Q \\ R \\ S \end{bmatrix} \quad y' = \begin{bmatrix} y_1' \\ y_2' \\ \vdots \\ y_m' \end{bmatrix} \quad x' = \begin{bmatrix} x_1' \\ x_2' \\ \vdots \\ x_m' \end{bmatrix}$$
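Stacking B on top of A turns the problem into one ordinary linear least-squares system, which also gives a route to parameter errors. The sketch below is not from the thread: it assumes independent errors of equal variance in x' and y', in which case the textbook covariance estimate is RSS/(2m - 4) times the inverse of MᵀM. The name `fit_with_errors` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_with_errors(x, y, xp, yp):
    """Return z = (P, Q, R, S), the residual sum of squares, and 1-sigma errors."""
    x, y, xp, yp = (np.asarray(v, dtype=float) for v in (x, y, xp, yp))
    m = x.size
    if m < 3:
        raise ValueError("need at least 3 points for an error estimate")
    ones, zeros = np.ones(m), np.zeros(m)
    B = np.column_stack([x, y, ones, zeros])    # rows predict x'
    A = np.column_stack([y, -x, zeros, ones])   # rows predict y'
    M = np.vstack([B, A])
    d = np.concatenate([xp, yp])
    z, *_ = np.linalg.lstsq(M, d, rcond=None)
    resid = M @ z - d
    rss = float(resid @ resid)                  # r^2 as defined above
    dof = 2 * m - 4                             # 2m equations, 4 parameters
    cov = rss / dof * np.linalg.inv(M.T @ M)    # assumes equal, independent errors
    return z, rss, np.sqrt(np.diag(cov))

# Synthetic data with small Gaussian noise (values are arbitrary)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 100.0, size=20)
y = rng.uniform(0.0, 100.0, size=20)
true = np.array([0.9, 0.3, 5.0, -2.0])
xp = true[2] + true[0] * x + true[1] * y + rng.normal(0.0, 0.01, size=20)
yp = true[3] - true[1] * x + true[0] * y + rng.normal(0.0, 0.01, size=20)
z, rss, sigma = fit_with_errors(x, y, xp, yp)
print(z, rss, sigma)
```

The errors on R and S come out in coordinate units, while those on P and Q are dimensionless, so the four numbers are not directly comparable.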

11. Aug 10, 2009

### Redbelly98

Staff Emeritus
I was thinking of the correlation coefficient r, not the residual sum of squares (often denoted by RSS).

12. Aug 10, 2009

### mgb_phys

I've been looking at it, and the problem with getting errors directly from the equations is that the coefficients are so different in nature: a small error in the sin/cos terms (P and Q) is much more significant than the same error in the origin (R and S).

What I did was find the fit, then work out the mismatch for each point in the known set, and then use the statistics of those mismatches.
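One way this could look in code, assuming "mismatch" means the distance between each transformed point and its known match. The thread does not say which statistics were used; RMS and maximum are shown only as examples, and `mismatch_stats` is a made-up name:

```python
import math

def mismatch_stats(P, Q, R, S, src, dst):
    """Per-point mismatch distances, plus their RMS and maximum."""
    dists = []
    for (x, y), (xp, yp) in zip(src, dst):
        dx = R + P * x + Q * y - xp
        dy = S - Q * x + P * y - yp
        dists.append(math.hypot(dx, dy))
    rms = math.sqrt(sum(d * d for d in dists) / len(dists))
    return dists, rms, max(dists)

# Identity transform; the third match is off by (3, 4)
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
dst = [(0.0, 0.0), (10.0, 0.0), (3.0, 14.0)]
dists, rms, worst = mismatch_stats(1.0, 0.0, 0.0, 0.0, src, dst)
print(dists, rms, worst)
```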

13. Aug 11, 2009

### Redbelly98

Staff Emeritus
Makes sense. Does the mismatch look reasonably Gaussian?

14. Aug 11, 2009

### mgb_phys

Too few points to tell.
In reality the error is likely to be due to an outlier where one match is just completely wrong.
The best approach is probably some iterative removal of outliers, but the specs call for a statistical measure of accuracy.
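For reference, one possible reading of "iterative removal of outliers" is sketched below: fit, drop the single worst match if its mismatch exceeds k times the median mismatch, and refit until nothing more is dropped. The threshold rule, the default `k=5.0`, the `floor` tolerance, and all function names are assumptions made for illustration, not anything specified in the thread.

```python
import numpy as np

def fit(x, y, xp, yp):
    """Linear least-squares fit of (P, Q, R, S)."""
    m = x.size
    ones, zeros = np.ones(m), np.zeros(m)
    M = np.vstack([np.column_stack([x, y, ones, zeros]),
                   np.column_stack([y, -x, zeros, ones])])
    z, *_ = np.linalg.lstsq(M, np.concatenate([xp, yp]), rcond=None)
    return z

def mismatch(z, x, y, xp, yp):
    """Distance between each transformed point and its match."""
    P, Q, R, S = z
    return np.hypot(R + P * x + Q * y - xp, S - Q * x + P * y - yp)

def fit_rejecting_outliers(x, y, xp, yp, k=5.0, min_points=3, floor=1e-9):
    """Drop the worst match one at a time while it exceeds k * median mismatch."""
    x, y, xp, yp = (np.asarray(v, dtype=float) for v in (x, y, xp, yp))
    keep = np.ones(x.size, dtype=bool)
    while True:
        z = fit(x[keep], y[keep], xp[keep], yp[keep])
        d = mismatch(z, x, y, xp, yp)
        worst = int(np.argmax(np.where(keep, d, -np.inf)))
        limit = max(k * np.median(d[keep]), floor)
        if d[worst] <= limit or keep.sum() <= min_points:
            return z, keep
        keep[worst] = False

# 4x4 grid of points, small noise, one grossly wrong match at index 5
rng = np.random.default_rng(1)
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0))
x, y = gx.ravel(), gy.ravel()
true = np.array([0.9, 0.3, 5.0, -2.0])
xp = true[2] + true[0] * x + true[1] * y + rng.normal(0.0, 0.01, size=16)
yp = true[3] - true[1] * x + true[0] * y + rng.normal(0.0, 0.01, size=16)
xp[5] += 5.0
z, keep = fit_rejecting_outliers(x, y, xp, yp)
print(z, np.flatnonzero(~keep))
```

Statistics computed only on the retained points will understate the error if good matches are rejected, so the number of dropped points is worth reporting alongside any accuracy figure.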