Exploiting Directions of Negative Curvature

  • Context: Graduate
  • Thread starter: brydustin
  • Tags: Curvature, Negative
SUMMARY

The discussion concerns the use of Hessian eigenvectors corresponding to negative eigenvalues in second-order optimization. The iterative step is defined as p = -sign(g'*v)*v, where g is the gradient and v is the eigenvector associated with the smallest eigenvalue. The question asks for the geometric interpretation of the dot product g'*v, and how the sign of the step should be chosen when the goal is to maximize rather than minimize, given that an eigenvector is only defined up to sign.

PREREQUISITES
  • Understanding of Hessian matrices in optimization
  • Familiarity with eigenvalues and eigenvectors
  • Knowledge of gradient descent and ascent methods
  • Basic concepts of local minima and maxima in multivariable calculus
NEXT STEPS
  • Research the geometric interpretation of eigenvectors in optimization contexts
  • Study the implications of negative curvature in Hessian matrices
  • Learn about gradient ascent techniques for maximizing functions
  • Explore the relationship between eigenvalues and the stability of optimization solutions
USEFUL FOR

Mathematicians, data scientists, and machine learning practitioners interested in advanced optimization techniques and the geometric foundations of Hessian-based methods.

brydustin
The title of an old paper... It mentions that, in order to use the full information of the Hessian in second-order optimization, part of your iterative step should include v, the eigenvector corresponding to the smallest eigenvalue (assuming that eigenvalue is negative), by taking
p = -sign(g'*v)*v, where g is the gradient. So here is the question: what is the geometrical meaning of the dot product g'*v? The idea there is to find a local minimum, but I'm trying to find a local maximum and would like to use similar information. A condition for a local minimum is that all the eigenvalues are positive, so in my case I would want all of them to be negative. So would I set p = + or - sign(g'*w)*w, where w is the eigenvector corresponding to the largest eigenvalue (assuming it is also greater than 0 -- obviously if max(eigenvalue) < 0 then the Hessian is negative definite, which is sufficient to find a maximizer)? Anyway, I appreciate any help on this... which sign do I pick, and why (what's the geometry behind it)?
Thanks
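For concreteness, the step described in the question can be sketched in NumPy as follows. This is only an illustration, not code from the paper; the function name `negative_curvature_step` and the assumption that H is symmetric with a negative smallest eigenvalue are mine:

```python
import numpy as np

def negative_curvature_step(g, H):
    """Sketch of the step p = -sign(g'v) * v for minimization.

    Assumes H is symmetric and its smallest eigenvalue is negative;
    `negative_curvature_step` is a made-up name for illustration.
    """
    eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order
    v = eigvecs[:, 0]                     # eigenvector of smallest eigenvalue
    s = np.sign(g @ v)
    if s == 0:
        s = 1.0  # g orthogonal to v: either orientation of v works
    # g'p = -|g'v| <= 0 (non-ascent) and p'Hp = lambda_min < 0 (negative
    # curvature), so f decreases along p to both first and second order.
    return -s * v
```

Along the returned p we get g'p <= 0 and p'Hp = min(eigenvalue) < 0. For the maximization variant asked about above, the mirror-image choice would be p = +sign(g'*w)*w with w the eigenvector of the largest (positive) eigenvalue, giving g'p >= 0 and p'Hp > 0.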
 
Finding the minimum of f(x) is the same problem as finding the maximum of -f(x).

That should be all you need to answer your questions about signs.
 
Okay... but that doesn't actually answer my question.
What is the geometrical meaning behind dot(gradient, eigenvector of the smallest eigenvalue)? Simply saying "flip the signs" doesn't explain anything. Maybe in my case -sgn(dot(g, eigenvector))*eigenvector still makes sense because of the sign convention of the eigenvector, but I don't know. The crux of the question is the geometry, not a naive change of sign. You don't change signs mindlessly; for example, when solving g + Hd = 0 you don't suddenly say d = inv(H)*g instead of d = -inv(H)*g. My question is one of geometry.
 
