# Parabolic curve fit to x,y data

1. May 24, 2010

### jkristoff

1. The problem statement, all variables and given/known data

Given a series of measured data points (>1000) x,y find the best fit parabolic curve where the constant A (below) is given.

2. Relevant equations

General 2nd deg equation describes conic sections:
Ax2+Bxy+Cy2+Dx+Ey+F=0

for a parabola B^2=4AC, AND in this case B<>0 (ie: the parabola is rotated)

3. The attempt at a solution

I've looked at orthogonal distance regression algorithms like ODRPACK, but I don't know where to begin with the inputs. What makes this more challenging than simply doing a trendline in Excel is that the parabola is rotated (B<>0) and I need to be able to constrain the first term constant A to a known value. To minimize the orthogonal distances while knowing A, the best fit parabola needs to be rotated and translated.
JK
1. The problem statement, all variables and given/known data

2. Relevant equations

3. The attempt at a solution

2. May 26, 2010

### hotvette

Intriguing problem, I've been pondering it for a couple of days. Sorry to say I don’t necessarily have answers but here are some thoughts that might stimulate discussion.

- Not sure why orthogonal distance regression is needed. Ordinary least squares should be able to be used on a problem like this. Nothing wrong with using ODR but the motivation to do so isn't obvious.

- I tried ODRPACK once but couldn’t figure it out either. Didn’t try very hard though. Solving problems using first principles is much more fun and educational (if you have the interest and time) than using canned programs.

- This could be viewed as a linear least squares problem with a single nonlinear constraint (i.e. B^2 = 4AC)

- Developing the governing equations for least squares with equality constraints is actually quite straightforward (even for parametric equations).

- One possible approach (since A is fixed) is to say Bxy+Cy2+Dx+Ey+F = -Ax2 and just treat the set of linear equations as a black box least squares problem (i.e. you have a known matrix A with more rows than columns, a known vector b, and an unknown vector x) and solve for the coefficients, though it isn’t clear to me what that solution actually represents (i.e. what is really being minimized). Seems to me the constraint could also be included.

- To me the most fundamental approach is to represent the problem parametrically (i.e. x = f(t) and y=g(t)) but it isn’t clear (to me) how to represent a general conic parametrically and how the coefficients of the parametric equations relate to A,B,C,D,E,F.

- Another approach that might bear fruit (you alluded to it) is to take a normal parabola and apply a rotation and translation. You'd want to be careful to understand what the resulting least squares problem really represents.

- Might be helpful to know what kind of class this is for and how much effort is expected. Is this suppose to be a 1hr hw problem or a two week project?

-------------- EDIT ---------------------------------------------------------------

Did a little research.

http://research.microsoft.com/en-us/um/people/awf/ellipse/ellipse-pami.pdf

Turns out that given a general conic f = Ax2+Bxy+Cy2+Dx+Ey+F= 0, minimizing the equation $\sum f^2$ is considered minimizing the squared 'algebraic' distances from the points to the curve. By including the equality constraint B2-4AC = 0 via Lagrange Multiplier, the problem can be solved in a straightforward fashion as a nonlinear least squares problem subject to a single equality constraint.

I made up a problem by rotating a parabola (with 1000 pts), added random noise to the points, set A=1, and solved for B, C, D, E, and F. Convergence was achieved in maybe a dozen iterations. All done wtih Excel using matrix functions (not very sexy, but gets the job done most of the time)

Last edited: May 27, 2010
3. May 28, 2010

### hotvette

After a bit more digging there appear to be two common approaches to least squares fitting of general conics:

1. Minimizing the sum of 'algebraic' distances as described in the edit of my previous post (though I still don't understand what an algebraic distance really is). In this specific problem a nonlinear constraint B2-4AC=0 needs to be included to guarantee that the solution is a parabola. I tried this method on a made up problem and it worked fine.

2. Minimizing the sum of 'geometric' distances, which appear to me to be orthogonal distances. The technique is to fit an ordinary parabola to rotated (and potentially translated) data points. Orthogonal least squares makes sense because the orthogonal distances are preserved in the 'rotated' space, whereas vertical distances (i.e. ordinary least squares) in the normal space make little sense in the rotated space. The problem is nonlinear because the angle is unknown and needs to be determined. I also tried this method on a made up problem and it worked fine (though I still don't know how to analytically determine the coefficients of the general conic using the angle and coefficients of the normal parabola).

Though I couldn't find any references in internet searches, there is I believe a third valid option. Minimize the distances in the normal space that are neither vertical nor orthogonal but will result in vertical distances in the rotated space (essentially an ordinary least squares solution in the rotated space). I think this could be done by adding constraints. Food for thought.

Last edited: May 28, 2010
4. May 28, 2010

### jkristoff

Thanks for your help!!! I am trying your first-reply suggestion using matrices in excel, but having trouble so far. How did you include the equality constraint?
Regarding the second-post approach#2, you can rotate a normal parabola (B=0) symmetrical about the y-axis centered at the origin using COT(2p)=(a-c)/b
where a,b and c are coefficients of the rotated parabola (0<p<PI).
Can you explain a bit further how you set up the made up problem for the orthogonal distance approach?
Thanks,
JK

5. May 28, 2010

### hotvette

The answer depends on how much you know about least squares. If you know how to derive the 'normal' equations associated with linear least squares, adding equality constraints is just an extension by using Langrange Multipliers to alter the equation being minimized. Your case is more complicated because the constraint is nonlinear. Thus, the solution is iterative by guessing at the initial values of the unknowns, solving a 'linearlized' set of equations (including the linearized constraint equations), updating the values of the unknowns, and iterating until desired convergence is achieved.

The attachments shows the linear and nonlinear situations.

#### Attached Files:

File size:
12.6 KB
Views:
211
• ###### NLCLS.jpg
File size:
21.1 KB
Views:
196
Last edited: May 28, 2010
6. May 28, 2010

### hotvette

Orthogonal Least Squares is inherently nonlinear because the values of x that result in an orthogonal vector from the curve to the data point is unknown. This particular problem is more complicated in that you are first rotating the data points (call them x' and y') through an unknown angle t (thus t needs to be included as an unknown).

http://en.wikipedia.org/wiki/Rotation_matrix

x* = x'*cos(t) - y'*sin(t) ....data point
y* = x'*sin(t) + y'*cos(t) ....data point

In orthogonal least squares we minimize the sum of the squared distances from the data points to the curve:

min $\sum$ [ (delta y)2 + (delta x)2 ] = min $\sum$ [ f2 + g2 ]

where

f = ax2 + bx + c - y*
g = x - x*

Making substitutions, we get:

f = ax2 + bx + c - x'*sin(t) - y'*cos(t)
g = x - x'*cos(t) + y'*sin(t)

We next linearize f & g (calling the linearized functions f*and g*) by taking the first two terms of the Taylor series:

$$f* = f_0 + (\partial f / \partial a) \Delta a + (\partial f / \partial b) \Delta b + (\partial f / \partial c) \Delta c + (\partial f / \partial x) \Delta x + (\partial f / \partial t) \Delta t$$

$$g* = g_0 + (\partial g / \partial x) \Delta x + (\partial g / \partial t) \Delta t$$

resulting in the linearized least squares problem

$$min \text{ }F = \sum [ (f*)^2 + (g*)^2 ]$$

To minimize F, take partial derivatives of F with respect to each variable (a,b,c,t,xi) and set equal to zero. The math is straightforward but tedius.

If none of this or the previous post is familiar to you, I'd wonder why such a homework problem would be given. Tools are needed to solve problems.

Last edited: May 28, 2010
7. May 29, 2010

### jkristoff

Thank you very much for your help thus far. This is not to be considered a 1 hour homework problem. I'm involved in an independent study with our student group. We are trying the characterize shapes of liquids measured in rotating containers. The x,y data is generated from a non-contact laser displacement probe. All of what you've mentioned is familiar to me; just not VERY familiar. I'm the old-timer of the group and so the math is a bit rusty at the moment (my old calculus book sits at the bedside table). I've been charged to handle to task of the algorithm because I have the engineering education.
I hope you don't mind that this is not a 1 hour homework problem. The forum guidelines states that questions regarding homework OR independent study may be posted here.
For the moment, I need to catch up with you regarding your last two posts.
Thanks again,
JK