# Empirical data

1. Sep 18, 2009

### a.mlw.walker

So I have studied some numerical methods such as Newton-Raphson, but I want to try to solve this particular problem using the secant method. The problem comes from an article I am reading, and I am trying to follow the maths through.

I am slightly wondering though how it is done.

Equation 1 (attached) is the equation I start with. The data I have are 50 values of $t(k \cdot 2\pi)$. I also know $C_0$, but I am trying to find estimates for a and b.

Please can someone explain how this may be done.

The article explains fitting the equation to a curve, and states that a, b and $C_0$ in equation 2 (attached) appear non-linearly.

It then goes on to say that once the curve has been fitted by minimizing equation 2, the result is equation 3 (attached).

Does anyone understand what this means, and what I would need to do to solve equation 1 with the secant method for a and b?

Thank you

#### Attached Files:

• ###### equation3.bmp
2. Sep 21, 2009

### hotvette

I can explain equation 2. If I have a series of data points $(x_i, y_i)$ that I want to fit with a function $y = f(x,a,b)$ in a least squares sense, the task is to minimize the sum of squares of the vertical distances between each data point and the curve (i.e., $y - f$). Thus, I'm trying to minimize:

$$S(a,b) = \sum \big( y_i - f(x_i,a,b) \big)^2$$
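As a minimal sketch of what this objective looks like in code (the model $f(x,a,b) = a e^{bx}$ here is purely hypothetical, a stand-in for whatever equation 1 actually is):

```python
import numpy as np

# Hypothetical model for illustration only: f(x, a, b) = a * exp(b * x)
def f(x, a, b):
    return a * np.exp(b * x)

# S(a, b): sum of squared vertical distances between data and curve
def S(a, b, x, y):
    return float(np.sum((y - f(x, a, b)) ** 2))

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.7, 7.4])
print(S(1.0, 1.0, x, y))  # small but nonzero: the curve nearly fits these points
```

S is zero only if the curve passes exactly through every data point; least squares finds the a, b that make it as small as possible.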

I have no idea on equation 3. Question - can you post the article or the relevant section?

3. Sep 21, 2009

### hotvette

If you want to know the steps for solving nonlinear least squares problems, they are summarized below:

1. Linearize the function $f(x_i,a,b)$ at initial guesses $a = a_0$ and $b = b_0$ using the first two terms of the Taylor series:

$$f \approx f_0 + f_a \delta a + f_b \delta b$$

where $f_a = \partial f / \partial a$ and $f_b = \partial f / \partial b$

2. Form the least squares problem using the linearized function:

$$F =\sum ( f_0 + f_a \delta a + f_b \delta b - y_i )^2$$

3. To minimize F, set partial derivatives of F with respect to $\delta a$ and $\delta b$ equal to zero:

\begin{align*} F_{\delta a} &= 0 = 2 \sum ( f_0 + f_a \delta a + f_b \delta b - y_i ) f_a \\ F_{\delta b} &= 0 = 2 \sum ( f_0 + f_a \delta a + f_b \delta b - y_i ) f_b \end{align*}

The net result is the following normal equations for nonlinear least squares analysis:

$$J^T J z = J^T (y - f)$$

where $z = \begin{bmatrix} \delta a & \delta b \end{bmatrix}^T$ and J is the Jacobian of first partial derivatives of f.

4. Solve the linear equation from step 3 for z and update guesses for a & b:

\begin{align*} a_1 &= a_0 + \delta a \\ b_1 &= b_0 + \delta b \end{align*}

5. Update J and f based on the updated values for a and b

6. Repeat steps 4 & 5 until $\delta a$ and $\delta b$ are sufficiently small.
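The loop above can be sketched in a few lines of Python/NumPy. The model $f(x,a,b) = a e^{bx}$ is a hypothetical stand-in for the actual fitting function (equation 1), chosen only so the Jacobian is easy to write down:

```python
import numpy as np

def gauss_newton(x, y, a, b, tol=1e-10, max_iter=50):
    """Iteratively solve J^T J z = J^T (y - f) and update a, b."""
    for _ in range(max_iter):
        f0 = a * np.exp(b * x)                        # f at the current guess
        # Jacobian columns: df/da = e^{bx}, df/db = a x e^{bx}
        J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
        z = np.linalg.solve(J.T @ J, J.T @ (y - f0))  # normal equations
        a, b = a + z[0], b + z[1]                     # update the guesses
        if np.max(np.abs(z)) < tol:                   # delta a, delta b small?
            break
    return a, b

# Synthetic data generated from a = 2, b = 0.5
x = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(0.5 * x)
a_hat, b_hat = gauss_newton(x, y, a=1.5, b=0.6)
```

Starting reasonably close to the answer, the iteration recovers a = 2 and b = 0.5 from the synthetic data.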

Last edited: Sep 21, 2009
4. Sep 21, 2009

### hotvette

The tie to Levenberg-Marquardt is as follows. The (square) matrix $J^T J$ can often be ill-conditioned at initial guesses of the function coefficients, making the equation $J^T J z = J^T (y-f)$ difficult if not impossible to solve. The (simpler) version of Levenberg-Marquardt modifies the problem by adding a small number $\lambda$ to the diagonals of $J^T J$, resulting in the following modified equation:

$$(J^T J + \lambda I) z = J^T (y-f)$$

which is more numerically stable the larger $\lambda$ is. The trick is to make $\lambda$ just large enough to solve the problem, then gradually reduce its magnitude to zero as the solution progresses.

One other note. The matrix $J^T J$ doesn't need to be computed at all if QR matrix factorization is used. Then, all that's needed is to factor $J = QR$ then solve the triangular system $R z = Q^T (y-f)$. See link below for more explanation.

http://www.alkires.com/teaching/ee103/Rec8_LLSAndQRFactorization.htm

The next question might be how to add $\lambda$ to the diagonals of $J^T J$ if $J^T J$ is never computed? Actually, it is done by augmenting J and (y-f) as follows:

$$J^* = \begin{bmatrix} J \\ \sqrt{\lambda} I \end{bmatrix} \qquad \qquad (y-f)^* = \begin{bmatrix} (y-f) \\ 0 \end{bmatrix}$$

Then factor $J^* = QR$ and solve $R z = Q^T (y-f)^*$.
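A sketch of one such augmented-QR step follows, again with the hypothetical model $f(x,a,b) = a e^{bx}$ and, for simplicity, a fixed $\lambda$; a real Levenberg-Marquardt implementation would adapt $\lambda$ between iterations as described above:

```python
import numpy as np

def lm_step(x, y, a, b, lam):
    """One Levenberg-Marquardt step via the augmented QR system."""
    f0 = a * np.exp(b * x)
    J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
    n = J.shape[1]
    J_aug = np.vstack([J, np.sqrt(lam) * np.eye(n)])  # J* = [J; sqrt(lam) I]
    r_aug = np.concatenate([y - f0, np.zeros(n)])     # (y-f)* = [y - f; 0]
    Q, R = np.linalg.qr(J_aug)                        # factor J* = QR
    z = np.linalg.solve(R, Q.T @ r_aug)               # solve R z = Q^T (y-f)*
    return a + z[0], b + z[1]

# Synthetic data generated from a = 2, b = 0.5
x = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(0.5 * x)
a, b = 1.5, 0.6
for _ in range(30):
    a, b = lm_step(x, y, a, b, lam=1e-3)
```

Note that $J^T J$ is never formed; the augmented rows play the role of $\lambda I$ exactly as in the equations above.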

This is probably more than you wanted to know........

Last edited: Sep 21, 2009
5. Sep 22, 2009

### a.mlw.walker

So in layman's terms, can you see how I can use Levenberg-Marquardt to solve equation 1 to find approximations of a and b, if I have lots of data for $t_k$?

6. Sep 22, 2009

### hotvette

Yes, but I presume you mean ordered pairs $\big( k_i, t(k_i) \big)$

7. Sep 22, 2009

### a.mlw.walker

Well, if $t_k$ = equation 1, I want to solve equation 2 for a and b. Is that the same?

8. Sep 23, 2009

### hotvette

Sort of. We need to be careful with terminology: $t_k$ is an experimentally determined data point. Equation 1 is the idealized function that is supposed to represent the data, but doesn't do so precisely because of experimental error in the data points. The method to estimate the parameters a & b is least squares. Equation 2 is the least squares problem to solve based on the desired fitting function (i.e. equation 1).

Thus, if you have a bunch of experimentally determined data points tk, estimates of the parameters a & b can be found by minimizing equation 2. I hope this helps explain the situation.

One thing that seems odd is the use of k in equation 2. The index k is meant to represent the index of the data points, but k is also used within the fitting function itself (i.e. equation 1). Either that's a typo, or the "x" values associated with the data points are really 0, 1, 2, ..., n.

9. Sep 23, 2009

### a.mlw.walker

OK, I will check the k issue, thanks for that. So what you are saying is that once I have the data points for $t_k$, I can use the least squares method to solve for nonlinear estimates of a and b.

It may just be notation, but the $S(a,b,c_0)$ confuses me a bit. Does that mean I am solving for a, b and $c_0$? Does the normal least squares method work for finding estimates of two unknown variables?

10. Sep 24, 2009

### hotvette

Correct. Least squares can be used to find estimates for a and b. But, since Equation 1 is nonlinear in a & b, the nonlinear least squares algorithm needs to be used, which means an iterative solution process based on initial guesses for a & b.

Hard to say without seeing the article, but it may be subtle terminology. Out of context I would read equation 2 as involving three unknowns. But you said in an earlier post that you already knew $c_0$. If that's true, I would write equation 2 as S(a,b) = ......

Least squares can be used for finding estimates for as many unknowns as you have (and the computational resources to calculate). I've heard of problems with thousands of unknowns or more.

Last edited: Sep 24, 2009
11. Sep 25, 2009

### a.mlw.walker

OK, so will I end up with a couple of simultaneous equations to get each approximation for a and b?

12. Sep 25, 2009

### hotvette

Correct, but it is an iterative process because the problem is nonlinear. You start with initial guesses for a & b. Then solve the least squares problem for $\delta a$ and $\delta b$ based on a linearized version of equation 1, update the values of a & b, and repeat until you have converged values for a & b. That's the way nonlinear least squares works.

I think you are at a disadvantage because you are jumping into a nonlinear least squares problem before understanding the basics of linear least squares.

13. Sep 29, 2009

### a.mlw.walker

Hi again, sorry for the long delay.

I have been looking over what I have on least squares curve fitting, and I was wondering how to tell what order my curve fit will have to be to give an accurate representation of the data. The article I got it from used that Levenberg-Marquardt method; did that mean they didn't need to worry about the order, and it just found a good result for a, b and $c_0$? My notes on least squares say that the final equation representing the data will be a quadratic?

14. Oct 1, 2009

### hotvette

This is confusing. Please explain what you mean by "order" and "quadratic". It almost sounds like you are talking about polynomial fitting functions.

Can't you post the relevant pages of the article? You are asking how to interpret an article we can't see.

15. Oct 2, 2009

### a.mlw.walker

Sorry, I was confusing both of us when I tried to match your post from Sep 21, 2009 with my notes on least squares. I will have to find a good tutorial on this Levenberg-Marquardt thing now. I understand that it does linear regression and then the Gauss-Newton method to minimise, but I'm not too sure about the Gauss-Newton bit at the moment.

16. Oct 2, 2009

### a.mlw.walker

Wow, I never saw that tutorial at the bottom of all your posts:

3. Linear Least Squares, Non-Linear Least Squares, Optimization