Hessian of least squares estimate behaving strangely

SUMMARY

The discussion centers on issues encountered during nonlinear least squares estimation using the quasi-Newton algorithm in MATLAB. The Hessian matrix derived from the optimization shows abnormal behavior, with multiple zero entries and eigenvalues indicating potential ill-conditioning. Users suggest exploring the Levenberg-Marquardt algorithm as a more robust alternative and recommend verifying the initial function's conditions to ensure accurate derivative calculations. The conversation emphasizes the importance of proper scaling and the potential need for alternative methods in eigenvalue calculation and Hessian inversion.

PREREQUISITES
  • Understanding of nonlinear least squares estimation
  • Familiarity with MATLAB and its optimization toolbox
  • Knowledge of Hessian matrices and their role in parameter estimation
  • Basic concepts of eigenvalues and matrix inversion
NEXT STEPS
  • Research the Levenberg-Marquardt algorithm for nonlinear regression in MATLAB
  • Learn about MATLAB's functions for calculating eigenvalues and matrix inversion
  • Explore techniques for diagnosing and addressing ill-conditioning in optimization problems
  • Investigate the implementation of nonlinear least squares in R as an alternative approach
USEFUL FOR

Data scientists, statisticians, and researchers involved in nonlinear regression analysis, particularly those using MATLAB for optimization tasks.

Jeffack
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ). I do this using the quasi-Newton algorithm in MATLAB. This also gives the Hessian (matrix of second derivatives) at the minimizing point. My point estimates all seem reasonable, but the Hessian does not:

Every value in row ##5## and in column ##5## is zero, except for the entry at (##5, 5##), which is 1. Several of the other entries are also zero.

To find standard errors, you invert the Hessian and take the square roots of the diagonal entries. When I do this, all of the resulting standard errors are near 1, and 4 of them are exactly 1.
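The recipe described here (invert the Hessian, then take square roots of the diagonal) can be sketched in a few lines of Python. The 3x3 matrix below is made up for illustration: its second row and column are zero except for a 1 on the diagonal, mimicking the pattern reported for row and column 5. Whether the inverse also needs rescaling (for example by a residual variance) depends on how the objective is defined, which this sketch does not address.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def standard_errors(hessian):
    """Square roots of the diagonal of the inverse Hessian (the recipe from the post)."""
    inv = invert(hessian)
    return [math.sqrt(inv[i][i]) for i in range(len(inv))]

# Hypothetical Hessian: row 2 and column 2 are zero except for a 1 on the diagonal.
H = [[4.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [1.0, 0.0, 3.0]]

se = standard_errors(H)
print(se)
```

A parameter whose row and column are zero apart from a unit diagonal entry is decoupled from the rest of the matrix, so its entry in the inverse is also 1 and the recipe reports a standard error of exactly 1 for it, whatever the data say.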

I went and looked back at the function, and I couldn't see anything blatantly wrong. When I change the value of the 5th parameter, the value of the function changes (as it should); that is pretty much the end of my troubleshooting ability. I don't think the function is badly scaled; all of the parameters are between -2 and 4.

The last thing I should mention is that I had MATLAB calculate the eigenvalues of the Hessian. The first 13 of them were approximately zero, and the last one was 170,000.
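For a symmetric matrix, the ratio of the largest to the smallest eigenvalue magnitude is its condition number, and a very large ratio means the inverse is numerically unreliable. The Python sketch below computes the eigenvalues of a small made-up matrix with one dominant eigenvalue, using a plain Jacobi rotation method; the matrix and the function name are assumptions for illustration only. In MATLAB, `eig` and `cond` return these quantities directly.

```python
import math

def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    norm = math.sqrt(sum(v * v for row in a for v in row))
    for _ in range(sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off <= tol * norm:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle (at most 45 degrees) that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

# Made-up matrix: 100 * v v^T + 0.001 * I with v = (1, 2, 2),
# so one eigenvalue is large and the other two are tiny.
H = [[100.001, 200.0, 200.0],
     [200.0, 400.001, 400.0],
     [200.0, 400.0, 400.001]]

eigs = symmetric_eigenvalues(H)
condition_number = abs(eigs[-1]) / abs(eigs[0])
print(eigs)
print(condition_number)
```

One large eigenvalue with the rest near zero, as in this toy matrix, is the signature of a matrix that is close to rank one.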

Any idea what's going on here? I've calculated the Hessians for very similar functions and not had this issue.
 
Jeffack said:
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ).

Did you mean that you minimize ##\Sigma_i(y_i - \hat y_i)^2## ?

I don't understand whether "##y##" and "##x##" denote vectors or scalars.

If you have a scalar ##y## that is a function of 14 scalar variables ##x_1, x_2, ...x_{14} ## then are you minimizing ##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j},...x_{14,j} ))^2 ## where ##j## is the index for the ##j##th sample ?
 
Sorry about that. You are correct: ##y## is an n-by-1 vector, where n is the number of observations. ##x_1, x_2,...x_{14} ## are all n-by-1 vectors. ##y_i## is the ##i##th element of vector ##y##. ##x_{k,i}## is the ##i##th element of vector ##x_k##. The problem should have been written as you wrote it:

##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j},...x_{14,j} ))^2##
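For concreteness, this objective can be written out in Python. The model `f` below is a hypothetical stand-in (the actual function was never posted in the thread), and it uses 2 regressors instead of 14 to keep the sketch short.

```python
import math

def f(x_row, theta):
    """Hypothetical model: NOT the function from the thread, just a stand-in."""
    a, b = theta
    x1, x2 = x_row
    return a * x1 + math.exp(b * x2)

def sum_squared_residuals(theta, X, y):
    """Sum over samples j of (y_j - f(x_{1,j}, ..., x_{k,j}))^2."""
    return sum((y_j - f(x_row, theta)) ** 2 for x_row, y_j in zip(X, y))

# Each row of X holds one sample's regressors; y holds the matching responses.
X = [(1.0, 0.0), (2.0, 1.0), (3.0, 2.0)]
true_theta = (0.5, 0.3)
y = [f(row, true_theta) for row in X]  # noise-free synthetic data

print(sum_squared_residuals(true_theta, X, y))  # zero at the generating parameters
print(sum_squared_residuals((0.0, 0.0), X, y))  # positive away from them
```

The optimizer only ever sees the scalar returned by `sum_squared_residuals` as a function of `theta`; the Hessian discussed in the thread is the matrix of second derivatives of that scalar with respect to the parameters.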
 
Without knowing some specifics, I can only suggest that you see what happens if you change some of the data you are using.

What is the function you are fitting to the data ?

Perhaps a MATLAB user can tell you what's going on if you show some of the MATLAB code.
 
I believe Levenberg-Marquardt is more robust than quasi-Newton methods for nonlinear regression, and it generally works better on ill-conditioned problems. Also, have you tried different methods for calculating the eigenvalues, or for inverting your Hessian matrix? That 170,000 sounds like the result of a very poor approximation of the inverse of the Hessian matrix, which could be caused by an ill-conditioned starting function. For example, if your initial function is flat, the derivative calculations might be flawed, and then everything later iterations build on them (the approximation of the inverse, or even the eigenvalues) might be flawed too. You have to make sure you meet the conditions that variable metric algorithms require, which are usually strict in practical use.
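To make the Levenberg-Marquardt suggestion concrete, here is a bare-bones Python sketch of the iteration on a toy two-parameter problem. The model `a*exp(b*x)`, the data, the starting point, and the damping schedule (divide by 10 on success, multiply by 10 on failure) are all assumptions chosen for illustration; this is not MATLAB's implementation. In MATLAB itself, `lsqnonlin` accepts `'levenberg-marquardt'` as its `Algorithm` option.

```python
import math

def model(x, a, b):
    """Hypothetical two-parameter model, chosen only for illustration."""
    return a * math.exp(b * x)

def ssr(params, xs, ys):
    """Sum of squared residuals for the toy model."""
    a, b = params
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))

def levenberg_marquardt(xs, ys, start, lam=1e-3, max_iter=200, tol=1e-12):
    a, b = start
    cost = ssr((a, b), xs, ys)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r, where J is the Jacobian of the model.
        jtj_aa = jtj_ab = jtj_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            da, db = e, a * x * e      # partial derivatives w.r.t. a and b
            r = y - a * e              # residual
            jtj_aa += da * da
            jtj_ab += da * db
            jtj_bb += db * db
            g_a += da * r
            g_b += db * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r,
        # solved with Cramer's rule because the system is only 2x2.
        m_aa = jtj_aa * (1.0 + lam)
        m_bb = jtj_bb * (1.0 + lam)
        det = m_aa * m_bb - jtj_ab * jtj_ab
        step_a = (g_a * m_bb - jtj_ab * g_b) / det
        step_b = (m_aa * g_b - jtj_ab * g_a) / det
        trial = (a + step_a, b + step_b)
        trial_cost = ssr(trial, xs, ys)
        if trial_cost < cost:          # accept the step and reduce damping
            a, b = trial
            cost = trial_cost
            lam /= 10.0
            if abs(step_a) + abs(step_b) < tol:
                break
        else:                          # reject the step and increase damping
            lam *= 10.0
            if lam > 1e12:             # damping exhausted: no further improvement
                break
    return (a, b), cost

# Noise-free synthetic data generated from a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [model(x, 2.0, 0.5) for x in xs]

(a_hat, b_hat), final_cost = levenberg_marquardt(xs, ys, start=(1.0, 0.1))
print(a_hat, b_hat, final_cost)
```

The damping term is what distinguishes this from a plain Gauss-Newton step: when a full step would increase the sum of squares, the step is shortened and pulled toward the gradient direction instead of being taken blindly.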

I would be curious to see the MATLAB code; however, that's a problem I would definitely suggest solving in R!
 
