I Hessian of least squares estimate behaving strangely

AI Thread Summary
The discussion revolves around issues encountered during nonlinear least squares estimation of a function with 14 variables using MATLAB's quasi-Newton algorithm. The Hessian matrix shows unusual behavior, with many zero entries and a significant eigenvalue disparity, suggesting potential problems with model conditioning. Suggestions include considering the Levenberg-Marquardt method for better robustness and exploring alternative methods for calculating eigenvalues and inverting the Hessian. The user is encouraged to share MATLAB code for further insights and troubleshooting. The conversation highlights the importance of ensuring proper conditions for variable metric algorithms in nonlinear regression.
Jeffack
Messages
14
Reaction score
0
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ). I do this using the quasi-Newton algorithm in MATLAB. This also gives the Hessian (matrix of second derivatives) at the minimizing point. My point estimates all seem reasonable, but the Hessian does not:

Every value in row ##5## and in column ##5## is zero, except for the entry at (##5, 5##), which is 1. Several of the other entries are also zero.

To find standard errors, you invert the Hessian and take the square root of the diagonals. When I do this, all of the estimates are near 1, and 4 of them are exactly 1.

I went and looked back at the function, and I couldn't see anything blatantly wrong. When I change the value of the 5th parameter, the value of the function changes (as it should); that is pretty much the end of my troubleshooting ability. I don't think the function is badly scaled; all of the parameters are between -2 and 4.

The last thing I should mention is that I had MATLAB calculate the eigenvalues of the Hessian. The first 13 of them were approximately zero, and the last one was 170,000.

Any idea what's going on here? I've calculated the Hessians for very similar functions and not had this issue.
 
Physics news on Phys.org
Jeffack said:
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ).

Did you mean that you minimize ##\Sigma_i(y_i - \hat y_i)^2## ?

I don't understand whether "##y##" and "##x##" denote vectors or scalars.

If you have a scalar ##y## that is a function of 14 scalar variables ##x_1, x_2, ...x_{14} ## then are you minimizing ##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j},...x_{14,j} ))^2 ## where ##j## is the index for the ##j##th sample ?
 
Sorry about that. You are correct: ##y## is a n-by-1 vector, where n is the number of observations. ##x_1, x_2,...x_{14} ## are all n-by-1 vectors. ##y_i## is the ##i##th element of vector ##y##. ##x_{n,i}## is the ##i##th element of vector ##x_n##. The problem should have been written as you wrote it:

##\Sigma_j (y_j - f(x_{1,j}, x_{2,j}, x_{3,j},...x_{14,j} ))^2##
 
Without knowing some specifics, I can only suggest that you see what happens if you change some of the data you are using.

What is the function you are fitting to the data ?

Perhaps a MATLAB user can tell you what's going on if you show some of the MATLAB code.
 
Jeffack said:
I am doing a nonlinear least squares estimation on a function of 14 variables (meaning that, to estimate ##y=f(x)##, I minimize ##\Sigma_i(y_i-(\hat x_i))^2## ). I do this using the quasi-Newton algorithm in MATLAB. This also gives the Hessian (matrix of second derivatives) at the minimizing point. My point estimates all seem reasonable, but the Hessian does not:

Every value in row ##5## and in column ##5## is zero, except for the entry at (##5, 5##), which is 1. Several of the other entries are also zero.

To find standard errors, you invert the Hessian and take the square root of the diagonals. When I do this, all of the estimates are near 1, and 4 of them are exactly 1.

I went and looked back at the function, and I couldn't see anything blatantly wrong. When I change the value of the 5th parameter, the value of the function changes (as it should); that is pretty much the end of my troubleshooting ability. I don't think the function is badly scaled; all of the parameters are between -2 and 4.

The last thing I should mention is that I had MATLAB calculate the eigenvalues of the Hessian. The first 13 of them were approximately zero, and the last one was 170,000.

Any idea what's going on here? I've calculated the Hessians for very similar functions and not had this issue.
I believe Levenberg-Marquardt is more robust than quasi-Newton methods for computational nonlinear regression, and generally works better for ill-conditioned problems. Also, have you tried different methods for calculating the eigenvalues? Or even, for inverting your Hessian matrix? That 170,000 sounds like a very poor approximation of the inverse of the Hessian matrix, which could be caused by an ill-conditioned starting function, e.g. say your initial function is flat, then the calculations of the derivatives might be flawed and thus, further iterations done by the approximation of the inverse or even the eigenvalues might be flawed. You have to make sure you meet the (usually strict under experimental use) conditions variable metric algorithms require.

I would be curious to see the MATLAB code, however that's a problem I would definitely suggest solving in R!
 
I was reading a Bachelor thesis on Peano Arithmetic (PA). PA has the following axioms (not including the induction schema): $$\begin{align} & (A1) ~~~~ \forall x \neg (x + 1 = 0) \nonumber \\ & (A2) ~~~~ \forall xy (x + 1 =y + 1 \to x = y) \nonumber \\ & (A3) ~~~~ \forall x (x + 0 = x) \nonumber \\ & (A4) ~~~~ \forall xy (x + (y +1) = (x + y ) + 1) \nonumber \\ & (A5) ~~~~ \forall x (x \cdot 0 = 0) \nonumber \\ & (A6) ~~~~ \forall xy (x \cdot (y + 1) = (x \cdot y) + x) \nonumber...
Back
Top