Hessian of least squares estimate behaving strangely

In summary, the conversation concerns a nonlinear least squares estimation on a function of 14 variables using the quasi-Newton algorithm in MATLAB. The point estimates seem reasonable, but the Hessian matrix does not: every value in row 5 and column 5 is zero except for the entry at (5, 5), which is 1, and 13 of the 14 eigenvalues are approximately zero while the last is about 170,000, even though the function is not badly scaled. Respondents suggest trying different methods for computing the eigenvalues and inverting the Hessian, and switching to a more robust algorithm such as Levenberg-Marquardt for ill-conditioned problems.
  • #1
Jeffack
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ). I do this using the quasi-Newton algorithm in MATLAB. This also gives the Hessian (matrix of second derivatives) at the minimizing point. My point estimates all seem reasonable, but the Hessian does not:

Every value in row ##5## and in column ##5## is zero, except for the entry at (##5, 5##), which is 1. Several of the other entries are also zero.

To find standard errors, you invert the Hessian and take the square roots of its diagonal entries. When I do this, all of the standard errors are near 1, and 4 of them are exactly 1.
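In MATLAB terms, the computation is essentially this (a minimal sketch, not my exact code; myObjective and theta0 are placeholder names):

Code:
% fminunc's sixth output is the (quasi-Newton) Hessian approximation
% at the solution; myObjective and theta0 are placeholders.
[thetaHat, fval, flag, out, grad, H] = fminunc(@myObjective, theta0);
se = sqrt(diag(inv(H)));   % standard errors from the inverse Hessian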

I went and looked back at the function, and I couldn't see anything blatantly wrong. When I change the value of the 5th parameter, the value of the function changes (as it should); that is pretty much the end of my troubleshooting ability. I don't think the function is badly scaled; all of the parameters are between -2 and 4.

The last thing I should mention is that I had MATLAB calculate the eigenvalues of the Hessian. The first 13 of them were approximately zero, and the last one was 170,000.
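The eigenvalue check itself was just the following (sketch; H as above):

Code:
ev = eig(H)   % 13 eigenvalues come out approximately zero, one near 1.7e5
cond(H)       % with that spread, the condition number is enormous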

Any idea what's going on here? I've calculated the Hessians for very similar functions and not had this issue.
 
  • #2
Jeffack said:
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ).

Did you mean that you minimize ##\Sigma_i(y_i - \hat y_i)^2##?

I don't understand whether "##y##" and "##x##" denote vectors or scalars.

If you have a scalar ##y## that is a function of 14 scalar variables ##x_1, x_2, \dots, x_{14}##, then are you minimizing ##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j}, \dots, x_{14,j}))^2##, where ##j## is the index of the ##j##th sample?
 
  • #3
Sorry about that. You are correct: ##y## is an n-by-1 vector, where n is the number of observations, and ##x_1, x_2, \dots, x_{14}## are all n-by-1 vectors. ##y_j## is the ##j##th element of vector ##y##, and ##x_{k,j}## is the ##j##th element of vector ##x_k##. The problem should have been written as you wrote it:

##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j},...x_{14,j} ))^2##
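In MATLAB terms, a minimal sketch of this setup (assuming the model also depends on the parameter vector ##\theta## that the optimizer varies; f, X, y, and theta0 are placeholders):

Code:
% X is n-by-14 (column k holds the vector x_k), y is n-by-1, and
% f(X, theta) returns the n-by-1 vector of model predictions.
sse = @(theta) sum((y - f(X, theta)).^2);   % residual sum of squares
thetaHat = fminunc(sse, theta0);            % quasi-Newton by default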
 
  • #4
Without knowing some specifics, I can only suggest that you see what happens if you change some of the data you are using.

What is the function you are fitting to the data?

Perhaps a MATLAB user can tell you what's going on if you show some of the MATLAB code.
 
  • #5
Jeffack said:
I do this using the quasi-Newton algorithm in MATLAB. [...] The last thing I should mention is that I had MATLAB calculate the eigenvalues of the Hessian. The first 13 of them were approximately zero, and the last one was 170,000.
I believe Levenberg-Marquardt is more robust than quasi-Newton methods for computational nonlinear regression, and it generally works better for ill-conditioned problems. Have you tried different methods for computing the eigenvalues, or for inverting your Hessian matrix? An eigenvalue spectrum with 13 near-zero values and one at 170,000 suggests that the quasi-Newton Hessian approximation is very poor, which can be caused by an ill-conditioned problem: if, say, your function is nearly flat around the starting point, the derivative calculations may be flawed, and every subsequent update to the inverse-Hessian approximation, and hence the eigenvalues, inherits that error. You also have to make sure you meet the conditions (usually strict in practice) that variable metric algorithms require.
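For instance, a minimal sketch with lsqnonlin (placeholder names; the standard errors here use the Gauss-Newton approximation ##H \approx 2J^TJ## rather than the quasi-Newton Hessian):

Code:
% resFun(theta) should return the n-by-1 residual vector y - f(X, theta).
opts = optimoptions('lsqnonlin', 'Algorithm', 'levenberg-marquardt');
[thetaHat, resnorm, res, flag, out, lambda, J] = ...
    lsqnonlin(resFun, theta0, [], [], opts);    % LM does not take bounds
sigma2 = resnorm / (numel(res) - numel(thetaHat));  % residual variance
covMat = sigma2 * inv(full(J' * J));  % J is returned as a sparse matrix
se     = sqrt(diag(covMat));          % standard errors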

I would be curious to see the MATLAB code; however, that's a problem I would definitely suggest solving in R!
 

1. Why is the Hessian matrix important in least squares estimation?

The Hessian matrix is important in least squares estimation because it describes the curvature of the cost function. Newton-type methods combine this curvature information with the gradient to choose better search directions and step sizes than plain steepest descent, making parameter updates more efficient; at the minimum, its inverse also yields the covariance estimates used for standard errors.
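As a hypothetical illustration, one Newton update on a toy quadratic cost, where the Hessian's curvature rescales the gradient step:

Code:
% Quadratic cost 0.5*theta'*H*theta - b'*theta with Hessian H.
H = [4 1; 1 3];        % curvature of the cost surface
b = [1; 2];
theta = [0; 0];        % current iterate
g = H*theta - b;       % gradient at theta
theta = theta - H \ g; % Newton step; for a quadratic cost a single
                       % step lands exactly on the minimizer H\b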

2. How does the Hessian matrix affect the performance of the least squares estimate?

The Hessian matrix can affect the performance of the least squares estimate in several ways. If the matrix is well-conditioned, it can lead to faster convergence and more accurate estimates. However, if the matrix is poorly conditioned or singular, it can lead to slow convergence and unstable estimates.

3. What are some signs that the Hessian of the least squares estimate is behaving strangely?

Some signs that the Hessian of the least squares estimate is behaving strangely include slow convergence, large variations in the estimated parameters, and a high condition number of the matrix. These issues can also lead to overfitting or underfitting of the model.

4. How can we address issues with the Hessian in least squares estimation?

One way to address issues with the Hessian in least squares estimation is to use regularization techniques, such as ridge regression, to improve the conditioning of the matrix. Another approach is to use alternative optimization algorithms that are less sensitive to the Hessian matrix, such as stochastic gradient descent.
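As a hypothetical sketch of the first idea, a ridge-style term added to an ill-conditioned Hessian shifts every eigenvalue away from zero:

Code:
H = [170000 0; 0 1e-8];              % toy Hessian with a huge eigenvalue spread
lambda = 1e-4;                       % regularization strength (problem-dependent)
Hreg = H + lambda * eye(size(H,1));  % every eigenvalue increases by lambda
fprintf('cond before: %.2e, after: %.2e\n', cond(H), cond(Hreg));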

5. Can the Hessian of the least squares estimate be ignored?

No, the Hessian of the least squares estimate should not be ignored. It provides important information about the curvature of the cost function and can greatly impact the performance of the model. Ignoring the Hessian can lead to inaccurate estimates and poor performance of the model.
