Nonlinear Least Squares Fitting

AI Thread Summary
The discussion focuses on implementing a nonlinear least squares fitting algorithm to find a rotation matrix and Euler angles from two sets of 3D points. The user is attempting to compute the Jacobian and the corresponding updates for the angles but is struggling with the correct formulation of dβ and the convergence of the solution. They initially define dβ as the difference between the transformed points and the rotation applied to the original points, but later question if this should be reversed based on definitions from external sources. Despite their implementation yielding results, they are uncertain about the correctness of their approach and seek clarification on the proper calculations and convergence issues. The conversation highlights the complexities involved in nonlinear optimization and the importance of accurate formulations in achieving convergence.
nschaefe
Hello,

So I was hoping to get some help implementing a nonlinear least squares fitting algorithm. Technically this is an extension of my previous thread; however, the problem I am having now is correctly computing the algorithm itself.

So the problem definition is this:

Given two sets of n 3D points X = (X_{1}, X_{2}, \dots, X_{n}) and X^{'} = (X^{'}_{1}, X^{'}_{2}, \dots, X^{'}_{n}), where X^{'}_{i} is the result of a rotation applied to X_{i}, find the corresponding rotation matrix and Euler angles.

The formulas I am following are from these two links:
Euler Angles Eqs 71 - 77 and NonLinear Least Squares Fitting

I am going to attempt to follow the convention of those articles. So first off I set up matrices X and X' as such

\begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \\ y_{1} & y_{2} & \cdots & y_{n} \\ z_{1} & z_{2} & \cdots & z_{n} \end{bmatrix}

where x_{i}, y_{i}, z_{i} are the components of X_{i} / X^{'}_{i}

Next I have set up my rotation matrices as follows:

R_{x}(\theta_{x}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\theta_{x}) & -\sin(\theta_{x}) \\ 0 & \sin(\theta_{x}) & \cos(\theta_{x}) \end{bmatrix}

R_{y}(\theta_{y}) = \begin{bmatrix} \cos(\theta_{y}) & 0 & \sin(\theta_{y}) \\ 0 & 1 & 0 \\ -\sin(\theta_{y}) & 0 & \cos(\theta_{y}) \end{bmatrix}

R_{z}(\theta_{z}) = \begin{bmatrix} \cos(\theta_{z}) & -\sin(\theta_{z}) & 0 \\ \sin(\theta_{z}) & \cos(\theta_{z}) & 0 \\ 0 & 0 & 1 \end{bmatrix}

Multiplied together:

R_{t} = R_{z}*R_{y}*R_{x}
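In case it helps to cross-check the matrices, here is a minimal sketch of R_{t} = R_{z}*R_{y}*R_{x} in Python/NumPy (the function name is my own):

```python
import numpy as np

def rotation_matrix(tx, ty, tz):
    """Combined rotation R_t = R_z(tz) @ R_y(ty) @ R_x(tx)."""
    cx, sx = np.cos(tx), np.sin(tx)
    cy, sy = np.cos(ty), np.sin(ty)
    cz, sz = np.cos(tz), np.sin(tz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx
```

A quick sanity check is that the result is orthogonal with determinant +1 for any angles, and the identity at zero angles.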

Next, take matrix A (which I take to be the rotation matrix R_{t}) and turn it into a column vector called f.

Then the Jacobian J of f with respect to \theta_{x}, \theta_{y}, \theta_{z} is computed

J*d\theta = df

This is where the first article stops. Moving to the next article,

J is A, d\theta is d\lambda, and df is d\beta I presume. However, I will keep the original convention.

Finally, you get J^{T}*J*d\theta = J^{T}*d\beta

so d\theta = (J^{T}*J)^{-1}*J^{T}*d\beta
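A minimal sketch of this normal-equation step in Python/NumPy, assuming a central-difference numerical Jacobian of f = vec(R_{t}) is acceptable (all helper names here are mine, not from the articles):

```python
import numpy as np

def rot(t):
    # Combined R = Rz @ Ry @ Rx for Euler angles t = (tx, ty, tz)
    cx, sx = np.cos(t[0]), np.sin(t[0])
    cy, sy = np.cos(t[1]), np.sin(t[1])
    cz, sz = np.cos(t[2]), np.sin(t[2])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def jacobian(theta, h=1e-6):
    # 9x3 Jacobian of f(theta) = vec(R(theta)) by central differences
    theta = np.asarray(theta, dtype=float)
    J = np.empty((9, 3))
    for k in range(3):
        tp, tm = theta.copy(), theta.copy()
        tp[k] += h
        tm[k] -= h
        J[:, k] = (rot(tp) - rot(tm)).ravel() / (2.0 * h)
    return J

def gauss_newton_step(theta, dbeta):
    # Solve the normal equations J^T J dtheta = J^T dbeta for dtheta
    J = jacobian(theta)
    return np.linalg.solve(J.T @ J, J.T @ dbeta)
```

For small angles a single step already lands close to the true values, since the linearization error is second order in the angles.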
So my questions are these:

1. How do I form d\beta? Right now I am taking my guess at the angles (call these \theta_{x0}, \theta_{y0}, \theta_{z0}) and using them to compute R_{t0}.

Then d\beta = X^{'} - R_{t0}*X, and it is turned into a column vector just like R_{t} is turned into f. Is this correct?

2. When I compute my new angles, should it be

\theta_{x,y,z} = \theta_{x,y,z} + d\theta or \theta_{x,y,z} = \theta_{x,y,z} - d\theta

I have successfully programmed all of these steps into a VB.NET program, but the solution is not converging and I cannot figure out why.
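For comparison, here is a minimal Python/NumPy sketch of the whole iteration (not the VB.NET code; all names are mine), using the point residual d\beta = vec(X^{'} - R_{t}*X) directly and the update \theta = \theta + d\theta:

```python
import numpy as np

def rot(t):
    # R = Rz @ Ry @ Rx for Euler angles t = (tx, ty, tz)
    cx, sx = np.cos(t[0]), np.sin(t[0])
    cy, sy = np.cos(t[1]), np.sin(t[1])
    cz, sz = np.cos(t[2]), np.sin(t[2])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def fit_angles(X, Xp, theta0=(0.0, 0.0, 0.0), iters=50, h=1e-6):
    """Gauss-Newton: find theta such that rot(theta) @ X ~ Xp.
    Convention: dbeta = vec(Xp - R @ X), update theta <- theta + dtheta."""
    theta = np.array(theta0, dtype=float)
    for _ in range(iters):
        dbeta = (Xp - rot(theta) @ X).ravel()
        J = np.empty((dbeta.size, 3))
        for k in range(3):
            tp, tm = theta.copy(), theta.copy()
            tp[k] += h
            tm[k] -= h
            J[:, k] = ((rot(tp) - rot(tm)) @ X).ravel() / (2.0 * h)
        dtheta = np.linalg.solve(J.T @ J, J.T @ dbeta)
        theta += dtheta
        if np.linalg.norm(dtheta) < 1e-12:
            break
    return theta
```

With noise-free data generated from a known rotation, this convention (Wolfram's d\beta = y - f, with a plus-sign update) converges in a handful of iterations.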

Any help greatly appreciated. Thanks
 
So I am realizing that the way I am calculating d\beta / df is incorrect, as it should yield a 3x3 matrix which is then reshaped into the 9x1 column vector. Can someone please explain how to find df? I am assuming it has something to do with the equation A = X^{'}X^{T}(XX^{T})^{-1}. Thanks
 
So I think I figured it out, but it still seems strange. I took d\beta = R_{t}(\theta_{x},\theta_{y},\theta_{z}) - X^{'}X^{T}(XX^{T})^{-1}, where \theta_{x}, \theta_{y}, \theta_{z} are updated at each iteration by \theta_{x,y,z} = \theta_{x,y,z} - d\theta_{x,y,z}, and it appears to be working.

However, it seems like it should actually be d\beta = X^{'}X^{T}(XX^{T})^{-1} - R_{t}(\theta_{x},\theta_{y},\theta_{z}), as this matches the definition on the Wolfram page, d\beta_{i} = y_{i} - f_{i}(\lambda_{1},\lambda_{2},\dots,\lambda_{n}), but the solution refuses to converge unless I write it as above.

Can anyone shed some light on this discrepancy and confirm I am calculating this correctly?
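One way to check the sign question numerically: because the normal equations are linear in d\beta, negating d\beta negates d\theta, so d\beta = y - f with \theta + d\theta and d\beta = f - y with \theta - d\theta give the same new angles. A sketch (names are mine; a fixed rotation stands in for X^{'}X^{T}(XX^{T})^{-1}):

```python
import numpy as np

def rot(t):
    # R = Rz @ Ry @ Rx for Euler angles t = (tx, ty, tz)
    cx, sx = np.cos(t[0]), np.sin(t[0])
    cy, sy = np.cos(t[1]), np.sin(t[1])
    cz, sz = np.cos(t[2]), np.sin(t[2])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def solve_step(theta, dbeta, h=1e-6):
    # dtheta from J^T J dtheta = J^T dbeta, J the 9x3 numerical
    # Jacobian of vec(R(theta)) by central differences
    J = np.empty((9, 3))
    for k in range(3):
        tp, tm = theta.copy(), theta.copy()
        tp[k] += h
        tm[k] -= h
        J[:, k] = (rot(tp) - rot(tm)).ravel() / (2.0 * h)
    return np.linalg.solve(J.T @ J, J.T @ dbeta)

A = rot(np.array([0.3, -0.2, 0.5]))     # stand-in "data" matrix y
theta = np.array([0.1, 0.0, 0.2])
d_plus = solve_step(theta, (A - rot(theta)).ravel())    # dbeta = y - f
d_minus = solve_step(theta, (rot(theta) - A).ravel())   # dbeta = f - y
# d_plus == -d_minus, so theta + d_plus equals theta - d_minus
```

So the two conventions should be interchangeable as long as the sign of the update matches the sign of the residual; a mismatch between the two is one thing worth checking in a non-converging implementation.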
 