Approximation Theory Help: Parameter Estimation & Fitting Curve

a.mlw.walker · Jul 15, 2011

Hi
Is there anyone here with a good understanding of approximation theory and parameter estimation techniques who can help me understand a chapter of a paper, and why certain techniques have been used? I have tried to follow the chapter through using the given data and MATLAB's optimization tools, but I cannot get a good fitting curve (at all). The method requires Levenberg-Marquardt to approximate some parameters, golden section search to improve them, and then a trisection search (which I think is golden section search?) to find more parameters based on the first ones found.

Thanks

clarkwgriswol · Jul 18, 2011

Could you post a reference to the article? Not sure if I could help but I'd be happy to take a look.

a.mlw.walker · Jul 19, 2011

Great, thanks. I have the relevant extract of the paper, which I have attached. I basically want to use Levenberg-Marquardt to solve equation 40 for a, b and c0. The value Tk is:

I have timings for tk: 1.994, 1.806, 1.632, 1.493, 1.2
x tk
1 1.2
2 1.493
3 1.632
4 1.806
5 1.994
It's an oscillating system that slows with time.
Equation 41 is apparently meant to refine the parameters further. However, I thought golden section search finds a minimum, and I am not sure what minimum that would be, because plotting the points above gives a constant downward gradient.
Equation 42 requires the approximations from 41.

I can't find a way (that I can get to work) to approximate those values. Any ideas?

(P.S. I can't upload twice so the image is my attachment about the same question here: https://www.physicsforums.com/showthread.php?t=513060)

hotvette · Jul 19, 2011

I found a link to a pdf of the actual paper, which should make the discussion a bit easier.

http://www.roulette.gmxhome.de/roulette[1].pdf

Equation 40 is a least squares formulation based on equation 35 (which is what I thought a couple of years ago when this topic was first introduced). Based on your 5 timings I did a quick and dirty curve fit using downhill simplex. Not sure I did it right, but both a and c0 are negative and the fit doesn't appear to be very good. The data look rather linear. Is this the same result you got with Matlab?

a.mlw.walker · Jul 20, 2011

Yeah, for the data points I used the MATLAB curve fitting tool, but I couldn't get it to do a Levenberg-Marquardt fit. It couldn't converge. Did you use my values for x? I think they should start at zero, not 1, though I haven't tried that yet. Is yours a Levenberg-Marquardt fit?
Equation 41 in the paper is supposed to refine the parameters, did you read that bit?

On that website, under manual, you can see his values for a and c0 can be negative. I don't think that's a problem.

I also tried a downhill simplex in MATLAB, but it still can't converge. How did you do that?
The equation I am fitting to is:
Fitted_Curve = (1/(a*b))*(c-asinh(sinh(c)*exp(a*Input*2*pi)));

but the graph after trying to solve is:

a.mlw.walker · Jul 20, 2011

Actually I have managed to get it to plot it. But like you, not a very good fit...

D H · Jul 20, 2011

a.mlw.walker said:

The equation i am fitting to is:
Fitted_Curve = (1/(a*b))*(c-asinh(sinh(c)*exp(a*Input*2*pi)));

You got a bad fit because you fit the wrong parameter. You should be fitting a, b, and c₀ by minimizing equation 40,

[tex]S(a,b,c_0) = \sum_{k=0}^N \left\{t_k - \frac 1{ab} \left(c_0-\sinh^{-1}(\sinh c_0\cdot e^{2ak\pi})\right)\right\}^2[/tex]

or you should be fitting a by minimizing equation 41.
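
For anyone following along, here is a minimal sketch of that sum of squares in Python (the MATLAB version would look much the same). It assumes the 2π-per-revolution factor from the MATLAB expression quoted above. The names `model_time` and `sum_squares` are made up for illustration and do not come from the paper.

```python
import math

def model_time(k, a, b, c0):
    """Predicted time of the k-th revolution, following the quoted MATLAB
    expression (1/(a*b))*(c0 - asinh(sinh(c0)*exp(a*k*2*pi)))."""
    growth = math.exp(2.0 * math.pi * a * k)
    return (c0 - math.asinh(math.sinh(c0) * growth)) / (a * b)

def sum_squares(params, times):
    """Sum of squared residuals over k = 0..N, where times[k] is the measured t_k."""
    a, b, c0 = params
    return sum((t_k - model_time(k, a, b, c0)) ** 2
               for k, t_k in enumerate(times))
```

`sum_squares` is the scalar that would be handed to whichever minimizer is in use. Note that with this expression the k = 0 term of the model is zero whatever a, b and c0 are.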

a.mlw.walker · Jul 20, 2011

what is different? are you saying that you either do 40 or 41, not both?

D H · Jul 20, 2011

I am saying that the one thing you shouldn't do is to do the fit from post #5 (as quoted in post #7).

From the paper, it looks like the author first did a multivariate fit in equation #40 on parameters a and b and then refined that estimate by doing a fit on a only via equation #41. In other words, equation #40 when minimized gives the final value for b but only an initial guess for a. The final value for a comes from minimizing equation #41.

Note well: That `with' section after equation #41 applies to both equations #40 and #41. In particular, note that c₀ is not a tuning parameter. It is instead an ugly expression that is a function of the tuning parameters a and b and of the time difference t₁-t₀.

a.mlw.walker · Jul 20, 2011

I tried that, but I can't get it to produce a good fit (compared to the above graphs) at all. The fit is closer when c0 is estimated.
From your earlier post
"The equation i am fitting to is:
Fitted_Curve = (1/(a*b))*(c-asinh(sinh(c)*exp(a*Input*2*pi)));"
What I am fitting is actually the difference between this and the actual data.

I understand that b is frozen in equation 41. However, I can't get the golden section search to produce a better estimate, though I am still approximating c0 at this stage.

D H · Jul 20, 2011

You shouldn't be estimating c0. It is merely a stand-in for

[tex]c_0 = -\coth^{-1}\left(\frac{e^{a2\pi}-\cosh(ab(t_1-t_0))}{\sinh(ab(t_1-t_0))}\right)[/tex]
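
A small sketch of that expression in Python, for checking signs and domains. Python's `math` module has no inverse hyperbolic cotangent, so the helper `acoth` below is written out by hand. Both `acoth` and `c0_from_ab` are hypothetical names, and `dt` stands for t₁-t₀.

```python
import math

def acoth(x):
    """Inverse hyperbolic cotangent, defined only for |x| > 1."""
    if abs(x) <= 1.0:
        raise ValueError("acoth is only defined for |x| > 1")
    return 0.5 * math.log((x + 1.0) / (x - 1.0))

def c0_from_ab(a, b, dt):
    """c0 as a function of a, b and dt = t1 - t0, using the formula posted above."""
    x = (math.exp(2.0 * math.pi * a) - math.cosh(a * b * dt)) / math.sinh(a * b * dt)
    return -acoth(x)
```

A handy sanity check: putting the resulting c0 back into (1/(a*b))*(c0 - asinh(sinh(c0)*exp(a*2*pi))) should return `dt`. The `ValueError` marks combinations of a, b and dt for which the expression has no real value, which may be one reason a fit fails to converge from a poor starting guess.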

a.mlw.walker · Jul 20, 2011

Yeah, I can't get that to converge. By the way, is t1 - t0 one of the values up there, or is it the difference between those times, i.e. tk(1) - tk(2)?

hotvette · Jul 20, 2011

D H said:

You shouldn't be estimating c0. It is merely a stand-in for

[tex]c_0 = -\coth^{-1}\left(\frac{e^{a2\pi}-\cosh(ab(t_1-t_0))}{\sinh(ab(t_1-t_0))}\right)[/tex]

Hmmm, if that's the case, the author's description of equation 40 is misleading. He indicates the function to minimize is a function of a,b,c₀ implying that all three need to be estimated. He should have said the function to minimize is a function of a,b where c₀=xxxxx. Not well written.

a.mlw.walker · Jul 20, 2011

hotvette, please could you try it? I can't get it to work.

hotvette · Jul 20, 2011

a.mlw.walker said:

Yeah, I can't get that to converge. By the way, is t1 - t0 one of the values up there, or is it the difference between those times, i.e. tk(1) - tk(2)?

Yeah, I was wondering that also. If the author uses consistent notation, t₀ should be t @ k=0, which is zero according to equation 35. Kind of confusing.

hotvette · Jul 20, 2011

a.mlw.walker said:

hotvette, please could you try it? I can't get it to work.

If you mean a two parameter estimation using post #11 as the definition of c₀, sure, but I don't think the result will be much better. The data points are close to linear. I doubt they will fit an exponential decay function very well no matter how much the parameters are manipulated. In fact, the fit should be worse because the value of c₀ will be restricted, but I'll give it a try (later).

a.mlw.walker · Jul 20, 2011

What do you reckon about the original data being cumulative, so:
x tk
1 1.2
2 2.693
3 4.325
etc. That way t1-t0 would be that time, but the curve is fitted to the cumulative times?
Edit! Yeah, cumulative fits better, much better, and I am using the correct c0. However, I am not using the individual timings but the cumulative time for T0, which doesn't seem correct...

Found out that T0 is the time for the initial time value, and is therefore a constant.
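
For reference, the cumulative times can be generated from the five per-revolution timings posted earlier with a running sum. This is only a sketch in Python; the variable names are illustrative.

```python
from itertools import accumulate

# Per-revolution timings from the first data post, in the order of the x/tk table.
intervals = [1.2, 1.493, 1.632, 1.806, 1.994]

# Running total: elapsed time at the end of each revolution.
cumulative = list(accumulate(intervals))

for k, t in enumerate(cumulative, start=1):
    print(k, round(t, 3))
```

Summing the posted timings this way gives 2.693 for the second row, and the third row matches the 4.325 above.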

hotvette · Jul 20, 2011

a.mlw.walker said:

What do you reckon about the original data being cumulative, so:
x tk
1 1.2
2 2.693
3 4.325
etc. That way t1-t0 would be that time, but the curve is fitted to the cumulative times?
Edit! Yeah, cumulative fits better, much better, and I am using the correct c0. However, I am not using the individual timings but the cumulative time for T0, which doesn't seem correct...

Found out that T0 is the time for the initial time value, and is therefore a constant.

Yep, looks better using cumulative. Attached is a three-parameter fit. However, what do you mean by "Found out that T0 is the time for the initial time value, and is therefore a constant"? Please elaborate.

a.mlw.walker · Jul 20, 2011

Yeah I got better too, but mine curves the other way...(attached). That is using D H's idea that c0 is not a parameter in itself.
The website that you said you found the whole document at says that T0 is the time of the initial revolution, on the page called manual at the bottom somewhere...

What have you plotted on the y axis? I plotted the cumulative time?

hotvette · Jul 20, 2011

a.mlw.walker said:

Yeah I got better too, but mine curves the other way...(attached). That is using D H's idea that c0 is not a parameter in itself.
The website that you said you found the whole document at says that T0 is the time of the initial revolution, on the page called manual at the bottom somewhere...

What have you plotted on the y axis? I plotted the cumulative time?

I made a goof and have just edited my last post with the correct version. Almost a perfect fit (looks identical to yours), though I still used 3 parameters. Looks like you are on the right track.

D H · Jul 20, 2011

hotvette said:

Hmmm, if that's the case, the author's description of equation 40 is misleading. He indicates the function to minimize is a function of a,b,c₀ implying that all three need to be estimated. He should have said the function to minimize is a function of a,b where c₀=xxxxx. Not well written.

I'm not going to argue with you on that one! Look at my post #7: I initially thought that equation #40 was a minimization with respect to those three parameters myself. It took a couple of readings to see that this is not the case.

hotvette · Jul 20, 2011

D H said:

I'm not going to argue with you on that one! Look at my post #7: I initially thought that equation #40 was a minimization with respect to those three parameters myself. It took a couple of readings to see that this is not the case.

Looks like our collective understanding is converging. I tried a two parameter fit and the result is nearly identical to a three parameter fit. Almost looks too good.

What still bothers me is the expression T₀ = t₁ - t₀ because it seems to me t₀ has to equal zero (based on equation #35).

D H · Jul 20, 2011

a.mlw.walker:

Unless you object, I'd like to move this back to mathematics, where you originally asked about this topic.

You didn't get any bites the first time around because you put the question in "General Math" and you gave the thread a bad title ("please can people advise me on this!"). Threads in General Math whose titles are of the form "please help me", are entirely in lower case, and end with "!" are typically from students asking us to do their homework for them while they go toss down a beer. So, other than perhaps the mentor responsible for that section, nobody even looked at your post.

This time around you got bites despite having put the thread in the wrong place (this is not a MechE question) because you gave the thread a good title and kept the original post short and to the point.

D H · Jul 20, 2011

Now back to the question at hand:

I have noticed on occasion that a non-linear multivariate fit doesn't seem to fit as well as I'd like, and it seems to leave some signal in the residuals. Polishing that initial fit with a second fit on a restricted set of variables oftentimes does the trick (e.g. as was done in this paper). That polishing is admittedly a bit ad hoc. If that adhocery doesn't work I either resort to something even more ad hoc or I back up and try again with a drastically different technique.

a.mlw.walker · Jul 21, 2011

Cool, we are getting to an agreement. Hotvette got the two and three parameter fits to agree. Hotvette, on the website where you found the actual document, there is a link to a page called manual.
Search for this line
t0: time of initial revolution
in the chapter called 3. Set Ball.
That is why I think T0 is just the time for the initial timing.

D H
If you would like to move it, do so by all means. Apart from it being specific to parameter estimation, I think the computing side of this problem could allow it to be considered as engineering, however i am not bothered. I don't even know if you have already moved it...

Your point on second fits. How do you know when a fit is good enough to not need a second fit? He does one yes, however the curve looks so good that is there some way to determine whether a second fit is necessary?

Once we do/don't do a second fit, the next problem is equation 42. Can we talk about what he does here? He uses a and b and some approximations for theta_f to optimize for the three parameters.
However, he says he can solve for the one nonlinear parameter first, and then for the linear parameters linearly.
I have read about this, but I am not sure how the method changes.
What is the method of trisection here? Google can't find much on it, but I suspect it's similar to the golden section search, in that it finds a minimum: as you vary the nonlinear parameter over 2pi there will be a minimum error to find...

Out of interest, if you have linear and nonlinear parameters, can you solve for all of them nonlinearly and still get the correct linear parameters, or does it have to be done the way he mentions? That's only a side note, just wondering...

Then the linear parameters can be found by any old means in the linear equation 43.
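
On the trisection question: the thread does not show exactly what the paper does, but assuming "trisection" means the usual ternary search for a minimum on an interval, a rough Python sketch would be the following. The name `trisection_minimize` is made up, and the objective `f` is whatever one-parameter error function is being minimized.

```python
def trisection_minimize(f, lo, hi, tol=1e-8, max_iter=200):
    """Ternary search for the minimum of a unimodal f on [lo, hi].

    Each step evaluates f at the two interior third-points and discards
    the outer third that cannot contain the minimum.
    """
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2   # minimum cannot lie in (m2, hi]
        else:
            lo = m1   # minimum cannot lie in [lo, m1)
    return 0.5 * (lo + hi)
```

For example, `trisection_minimize(lambda phi: (phi - 1.0) ** 2, 0.0, 6.283185307179586)` homes in on phi = 1. The idea is the same as golden section search: both shrink a bracket around a single minimum. Trisection needs two new function evaluations to shrink the bracket to 2/3 of its width, whereas golden section reuses one of its points and needs only one new evaluation for a shrink to about 0.618. Neither is reliable if the error has more than one dip inside the bracket.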

a.mlw.walker · Jul 21, 2011

Guys, I have tried to use the golden section search for equation 41. It runs successfully but doesn't seem to produce a 'better' fit. What do you reckon. I used beta as ab^2 like he said, and c is a constant from the solution of the first part (attached)
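
For comparison with whatever implementation was used here, this is a minimal golden section search in Python. It is a generic sketch: `golden_section_minimize` is a hypothetical name, `f` is a placeholder for the one-parameter objective (for instance equation 41 as a function of a alone), and the bracket `[lo, hi]` has to be supplied by the user.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1/golden ratio, about 0.618

def golden_section_minimize(f, lo, hi, tol=1e-8, max_iter=200):
    """Golden section search for the minimum of a unimodal f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)
```

The search only ever returns a point inside the starting bracket, so if the refined value sits at one end of `[lo, hi]` the bracket was probably too narrow. It also minimizes whatever `f` measures: if `f` is not the same error that the plot shows, a "worse looking" fit after refinement is possible.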

a.mlw.walker · Jul 26, 2011

hey guys, you gone on holiday?

hotvette · Jul 26, 2011

a.mlw.walker said:

hey guys, you gone on holiday?

In my case it was because I wasn't sure there was anything more I could add that would be useful. I reviewed the paper one more time and am convinced of the following (others may disagree).

1. Step 1: equation #40 is intended to obtain initial estimates of all three parameters (not two) based on equation #35. Subsequent discussion refers to further refinement steps and has nothing to do with equations #35 and #40.

2. Step 2: parameter refinement for a & b using equation #41 (and definition for c₀) isn't intended to get a better fit for equation #35. It is meant to get better estimates of a & b for use in step 3.

3. Step 3: using the refined estimates for a & b from step 2, obtain estimates for the remaining parameters using equation #43.

I believe the intention is to discard the previous equations at each new step. Re why the author chose the particular optimization method for each step, it is difficult to say. Perhaps the author tried several methods each time and found one that seemed to work better in each case.

I seem to recall you asking a question about a curve fitting situation where some parameters are linear and others are non-linear. As far as I know, even if only 1 parameter is non-linear they all need to be treated as non-linear. Even if a least squares problem is linear in all unknowns, it can still be approached as a non-linear problem (and solved in a single iteration).

Hope this helps. I really don't think there is anything more of use I can add.

a.mlw.walker · Jul 27, 2011

Great, thanks Hotvette. However, after equation 42 and before 43, the author writes that using the method of trisection the value for phi can be found, and then the equation becomes a linear equation (43). How would you find the value of phi without also finding the other parameters? Or is this what you are saying: solve it completely nonlinearly and then, using the value of phi, improve the other approximations linearly?
Did you see my graph above, trying to improve the parameter a, gave a worse fit?
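
One possible reading of "find phi first, then solve linearly" is a separable least squares fit: for each trial phi the remaining parameters enter linearly, so they can be solved directly, and only phi needs a one-dimensional search. The Python sketch below shows the idea on a toy model y = A*cos(x + phi) + B. This is NOT equation 42 or 43 from the paper. The function names are invented, and a plain grid scan over phi is swapped in for trisection to keep it short.

```python
import math

def fit_linear_part(xs, ys, phi):
    """For a fixed phi, least-squares A and B in y = A*cos(x + phi) + B."""
    us = [math.cos(x + phi) for x in xs]
    n = len(xs)
    u_mean = sum(us) / n
    y_mean = sum(ys) / n
    s_uu = sum((u - u_mean) ** 2 for u in us)
    s_uy = sum((u - u_mean) * (y - y_mean) for u, y in zip(us, ys))
    A = s_uy / s_uu
    B = y_mean - A * u_mean
    rss = sum((y - (A * u + B)) ** 2 for u, y in zip(us, ys))
    return A, B, rss

def fit_separable(xs, ys, n_grid=2000):
    """Scan phi over [0, pi); solve A and B linearly at each trial phi.

    Only [0, pi) is scanned because phi + pi with -A gives the same curve.
    Returns (phi, A, B, rss) for the trial phi with the smallest residual.
    """
    best = None
    for i in range(n_grid):
        phi = math.pi * i / n_grid
        A, B, rss = fit_linear_part(xs, ys, phi)
        if best is None or rss < best[3]:
            best = (phi, A, B, rss)
    return best
```

With this structure the linear parameters are always the best possible ones for the current phi, so the outer search only has one unknown to worry about. Fitting everything at once with a general nonlinear minimizer should reach the same minimum if it converges, but it has more directions in which to go wrong.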

hotvette · Jul 27, 2011

a.mlw.walker said:

Great, thanks Hotvette. However, after equation 42 and before 43, the author writes that using the method of trisection the value for phi can be found, and then the equation becomes a linear equation (43). How would you find the value of phi without also finding the other parameters? Or is this what you are saying: solve it completely nonlinearly and then, using the value of phi, improve the other approximations linearly?
Did you see my graph above, trying to improve the parameter a, gave a worse fit?

It's difficult to unambiguously interpret the paper. I think it's clear that equation #40 is a 3-parameter problem. It clearly says S(a,b,c₀) = xxx and all three are mentioned in the next sentence.

Beyond that it's a bit fuzzy. Equation #41 looks like a single parameter problem (i.e. S(a) = xxx) to refine the value of 'a', thus the use of golden section or trisection. Beta is fixed using the values of a & b that were determined from equation #40. What isn't clear is whether c₀ is considered fixed using the same values of a & b that were used for Beta or whether 'a' is still considered a variable. Once you solve equation #41, forget about #40 and #35, they no longer apply (that was my point in the previous post).

What's the ultimate goal of this? Predict where the ball will land?

a.mlw.walker · Jul 27, 2011

I suppose that's the ultimate goal. I am applying for a job in finance and have been advised that I have a very good grasp of estimation theory. I hunted the internet for more complex examples and came across this years ago, so I thought that I would try and solve them. My background is mechanical engineering, but usually when I need to fit a curve it's to a polynomial, not to an equation like the one in this paper.

I have read a little more and emailed the author, and have found out that eqn 41 is used to quantify the goodness of the fit, i.e. if a changes much then the fit is not very good. D H mentioned that this technique is an 'ad hoc' technique. As the original fit from equation 40 is better, I will use other methods to describe the goodness of the fit.

Just looking at equation 40: the minimization is usually over the sum of the squared differences between real data and theoretical data. However, in equation 40 I can't tell which part is the real data part. Can you see what I mean?

EDIT: Oh right, eqn 41 is not the sum squared, it is the modulus of the sum. Why has the author written this differently?

D H · Jul 27, 2011

hotvette said:

1. Step 1: equation #40 is intended to obtain initial estimates of all three parameters (not two) based on equation #35. Subsequent discussion refers to further refinement steps and has nothing to do with equations #35 and #40.

One problem with this: The c₀ in equation #35 is a function of b and c₁, the latter of which is given by equation #25 or #26.

That said, it does appear after multiple readings of the text around equation #40 that equation #40 is a fit for three parameters. The factor c₀ found with this fit is apparently tossed out.

2. Step 2: parameter refinement for a & b using equation #41 (and definition for c₀) isn't intended to get a better fit for equation #35. It is meant to get better estimates of a & b for use in step 3.

Equation #41 is a fit for one parameter, a, not two. Whether the value for b is recomputed from the "frozen" value for beta isn't at all clear. That he said "frozen" does suggest that b does need to be recomputed here.

3. Step 3: using the refined estimates for a & b from step 2, obtain estimates for the remaining parameters using equation #43.

I agree with that interpretation.

Re why the author chose the particular optimization method for each step, it is difficult to say. Perhaps the author tried several methods each time and found one that seemed to work better in each case.

Or perhaps he found one that worked better in the one case he had on hand. One thing is certain: This approach reeks of ad-hocery. Why not fit for all parameters at once? And trisection? Seriously? That is one of the worst optimization techniques around.

This is perhaps a bit disparaging, but it appears that the author knows a limited number of optimization techniques. To overcome the limitations of those techniques he used a lot of ad hoc, ummm, stuff. There's no mention in the paper of the ruggedness of the optimization landscape or of any correlations between his chosen tuning parameters. I suspect there's a lot of nastiness going on such as correlated coefficients and a rugged landscape with a long curving valley. Perhaps another technique would fare better. Simulated annealing or a biologically-motivated technique such as ant search might well attack all of the parameters at once.

One more point: The author obviously had a lot more than five data points (noisy data points at that) at hand. Noisy data does tend to make for a rugged landscape.

a.mlw.walker · Jul 28, 2011

Hi, can I just ask what you think about the setup of equation 42? Usually with least squares methods you take the actual data minus the computed answer, square it, sum it and find the minimum. What part of equation 42 is the actual data part (apart from theta_f)? I am a little confused as to how to set the MATLAB up for this.
At the moment I am taking:
f = sum(abs(c1.*exp(-2*a*theta_f)+nu*((1+0.5*(4*a^2+1))*cos(theta_f+phi)-2*a*sin(theta_f+phi))+b^2-wf2))

That is the sum of the modulus of everything in the equation. I think this is what I am trying to minimize?
I took your advice and decided I would try to solve for all the parameters in 42 in one go. I am again using the golden section search. However, one question comes from the fact that my wf^2 comes out the same as my 'nu', at least when their upper and lower boundaries start out the same. I just wondered how sensitive this method is to the start values, because changing the start values has a significant effect on the values it calculates.

hotvette · Jul 28, 2011

I don't know where equation #42 came from but it isn't a least squares problem. It's just some function to be minimized with respect to several variables, meaning the partial derivatives of the function with respect to each variable need to be zero.

a.mlw.walker · Jul 29, 2011

OK, so for this one what method would you suggest? I know he does a trisection search (I have never used trisection, and D H said it wasn't good, so I attempted using golden section search).
However, I think I am running into what he is talking about. He said he solves for phi between 0 and 2pi using trisection, then solves equation 43 using other methods (I think by hand?). My values for nu and omega^2 always come out the same using this method, so I think this is why he did it his way. However, if I want to get all parameters at once, what method would you recommend? Can I use the downhill simplex for this, and just set it up as the modulus and the sum rather than squaring it?
D H mentioned an ant search, which I have been reading about; however, I don't think I understand what the algorithm is doing. A nice robust method like that did sound appealing, though.

The data he uses for equation 42 is in the attached graph. It's pretty dirty...
The x-axis is T0 (initial time), the y-axis is theta_f (fall angle).
