Data Fitting with Bezier curve

AI Thread Summary
Fitting a cubic Bezier curve to data points involves determining control points based on parametric equations for x and y, which are decoupled. The challenge lies in finding the appropriate t values for each data point, as the x equation is cubic in t. Initial guesses for t can be made uniformly or through chord-length parameterization, and iterative methods like Newton-Raphson can refine these guesses. Alternative approaches include using stochastic gradient descent to minimize square error by perturbing control points and evaluating the resulting fit. The discussion highlights the complexity of the fitting process and suggests that orthogonal least squares may be a more effective method for parametric equations.
hotvette
Homework Helper
Messages
1,001
Reaction score
11
Does anyone know how to fit a cubic Bezier curve to a given set of data points? If so, I'd appreciate some coaching on the methodology.

Bezier curves have separate equations for x and y in a parametric variable t that varies from 0 to 1:

x = at3 + bt2 + ct + d

y = et3 + ft2 + gt + h

the 8 unknowns are a function of 4 control points:

xc = (x0, x1, x2, x3)
yc = (y0, y1, y2, y3)

thus, for a given set of values for the 4 control points, all 8 coefficients are determined and the x,y values of the Bezier curve can be calculated.

But, the equations are de-coupled in the sense that a,b,c,d depend ONLY on the x-values of the 4 control points (xc) and e,f,g,h depend ONLY on the y-values of the 4 control points (yc).

I've been trying (in vain so far) to set up the curve fitting as a least squares problem. Part of the problem is the difficulty of finding the right value of t for a given x of a data point being fitted, since x is cubic in t. One approach is to solve the cubic equation, but I've found that Newton-Raphson works quite well.

It turns out that solving for the y-values of the 4 control points (yc) is a straightforward linear least squares problem because the equations are linear in yc. Thus, for an initial guess of the x-values of the control points (xc) the values of t for each data point can be determined (e.g. by Newton-Raphson), and in a single least squares calculation, the y-values of the control points (yc) can be determined to minimize the square error in vertical distance between the data points and the calculated curve.

The problem is how to get updated x-values of the control points (xc). :frown:

I've searched the internet extensively and though have found references to least squares solutions to Bezier curves, I've never found enough detail to figure out how to do it. :confused:

Any/all hints/tips would be appreciated. Perhaps an approach other than least squares is better. Thanks!
 
Last edited:
Mathematics news on Phys.org
Last edited by a moderator:
That link you posted doesn't exist anymore. Does anyone know where to find that code now?
 
Hello:

I'm not sure if you still have interest in this. I was looking into this problem and found this thread. Now that I've made up my mind on the subject, I thought I would share it here.

As stated, the problem is to find the t_i values for the input data points, so that the LS formula can be used to compute the control points.

Two alternatives for initial guesses:

1. Distribute them uniformly in [0,1] according to the number of data
points N: 0, 1/N, 2/N, ...
2. Use chord-length parameterization, which seems to involve a polynomial fit
up to point ith, then measuring the arc length
See Koichi Itoh: "A curve fitting algorithm for character fonts"
ELECTRONIC PUBLISHING, VOL. 6(3), 195–205 (SEPTEMBER 1993)

Now on how to refine the t_i values.

In Mr Itoh's paper, the suggested strategy for updating t_i is:
solve the 5th degree equation in t_i (e.g., with Newton-Raphson):
[B(t)-P_i]B'(t) = 0
with B'(t) the tangent to the Bezier B() at point t
Then the new t_i values are used to generate the LS control points. Iterate

Or, an alternative approach to obtain refined control points can be used.
This deviates somewhat from the original problem, but it results in a good fit nevertheless.

One can treat the total square error as a function of the control points, and use the control points from the initial guess (obtained with either 1. or 2. above) to start a numerical gradient descent. There exists more than one choice of gradient descent.
For example, a signed stochastic gradient approximation iteration reads:

- Let the current control points be C = (x0, x1, x2, x3, y0, y1, y2, y3)
- Perturb them, e.g., with a single random perturbation vector we generate diametrical
candidate vectors:

Cp = C + mu*delta
Cm = C - mu*delta
where mu > 0 is a step size parameter
- Measure the square error for Cp and for Cm
For each data point:
* Compute t_i, e.g. with the cubic root or with Newton-Raphson.
This can be done with any coordinate of the data point
* Find the values B(t_i) and compute the total square error
- Update the control points:
If Cp gives smaller square error, then do C = Cp and iterate
Otherwise do C = Cm and iterate
The iterations go until maximum of iterations is reached or the
square error changes less than a prescribed threshold

You can also use more than two perturbations, e.g., generate 8 random perturbation
vectors and do the plus-minus perturbations for each
Then you need to evaluate the cost function 16 times, but the convergence
may take less iterations

Alternative stochastic methods in the same spirit include the
"random walk with direction exploitation" method, see e.g.,
S. Rao, Optimization: Theory and Applications

I've been using the sotchastic signed gradient method with good results.
The only problem was that I had to supply the starting point manually.
From now on, I will use the control points from the
LS estimate with uniform allocation as starting point

One last thing. I believe that the aforementioned dead link refers to a file in the
Matlab Central: "cubic Bezier least square fitting"

Thanks for the original post and good luck!



Eduardo
 
  • Like
Likes hunt_mat
It took a long time to figure out how to approach it, but I was able to solve the problem (with brief guidance from a statistics professor friend). Let's say that y = f(a,b,t) and x = g(c,d,t) and the data points are (xi,yi). The least squares problem can be stated as the following constrained problem:

min \sum(f-y_i)^2 s.t. g-x_i=0

Using the langrange multiplier method (one constraint for each data point) the math is straightforward though tedious and results in a set of decoupled equations for the unknown coefficients (a,b,c,d using the above notation) and the ti. Also, since the ti are unknown, a nonlinear approach is needed (i.e. linearizing the equations before taking partial derivatives and setting them equal to zero). The lagrange multipliers conveniently fall out and never have to be computed.

I've since convinced myself that orthogonal least squares makes more sense when using parametric equations since the curves can wrap (e.g. general conic like a rotated ellipse).
 
Last edited:
Thread 'Video on imaginary numbers and some queries'
Hi, I was watching the following video. I found some points confusing. Could you please help me to understand the gaps? Thanks, in advance! Question 1: Around 4:22, the video says the following. So for those mathematicians, negative numbers didn't exist. You could subtract, that is find the difference between two positive quantities, but you couldn't have a negative answer or negative coefficients. Mathematicians were so averse to negative numbers that there was no single quadratic...
Insights auto threads is broken atm, so I'm manually creating these for new Insight articles. In Dirac’s Principles of Quantum Mechanics published in 1930 he introduced a “convenient notation” he referred to as a “delta function” which he treated as a continuum analog to the discrete Kronecker delta. The Kronecker delta is simply the indexed components of the identity operator in matrix algebra Source: https://www.physicsforums.com/insights/what-exactly-is-diracs-delta-function/ by...
Thread 'Unit Circle Double Angle Derivations'
Here I made a terrible mistake of assuming this to be an equilateral triangle and set 2sinx=1 => x=pi/6. Although this did derive the double angle formulas it also led into a terrible mess trying to find all the combinations of sides. I must have been tired and just assumed 6x=180 and 2sinx=1. By that time, I was so mindset that I nearly scolded a person for even saying 90-x. I wonder if this is a case of biased observation that seeks to dis credit me like Jesus of Nazareth since in reality...
Back
Top