Exploiting Directions of Negative Curvature

brydustin
The title is from an old paper... It mentions that, in order to use the full information of the Hessian in second-order optimization, part of your iterative step should include v, the eigenvector corresponding to the smallest eigenvalue (assuming that eigenvalue is negative), by setting p = -sign(g'*v)*v, where g is the gradient.
So here is the question: what is the geometrical meaning of the dot product g'*v? The idea in the paper is to find a local minimum, but I'm trying to find a local maximum and would like to use similar information. A condition for a local minimum is that all the eigenvalues of the Hessian are positive, so in my case I would want all of them to be negative. Would I then set p = ±sign(g'*w)*w, where w is the eigenvector corresponding to the largest eigenvalue (assuming that eigenvalue is also greater than 0 -- obviously if max(eigenvalue) < 0 then the Hessian is negative definite and already suitable for finding a maximizer)? Anyway, I appreciate any help on this... which sign do I pick, and why (what's the geometry behind it)?
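In case it helps to see what I mean, here is my own rough NumPy sketch of the step the paper describes (my paraphrase of it, not the paper's code):

```python
import numpy as np

def negative_curvature_step(g, H):
    """Rough sketch of the step described above (minimization case).

    g : gradient at the current iterate
    H : symmetric Hessian at the current iterate
    Returns p = -sign(g'v) * v, where v is the eigenvector of the
    smallest eigenvalue of H, or None if that eigenvalue is >= 0.
    """
    eigvals, eigvecs = np.linalg.eigh(H)  # eigh: eigenvalues ascending, H symmetric
    lam, v = eigvals[0], eigvecs[:, 0]    # smallest eigenvalue and its eigenvector
    if lam >= 0:
        return None                       # no negative curvature to exploit
    s = np.sign(g @ v)
    if s == 0:                            # g orthogonal to v: either orientation works
        s = 1.0
    return -s * v
```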
Thanks
 
Finding the minimum of f(x) is the same problem as finding the maximum of -f(x).

That should be all you need to answer your questions about signs.
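If it helps to see it concretely, here is a small NumPy check (random matrix, names mine): negating the function negates the gradient and the Hessian, so every eigenvalue flips sign while the eigenvectors stay put, and "largest eigenvalue of H" becomes "smallest eigenvalue of -H".

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2                 # a random symmetric "Hessian"

w, _ = np.linalg.eigh(H)          # eigenvalues of H, ascending
w_neg, _ = np.linalg.eigh(-H)     # eigenvalues of -H, ascending

# Eigenvalues of -H are the negated eigenvalues of H (in reversed order).
print(np.allclose(w_neg, -w[::-1]))   # True
```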
 
Okay... but that doesn't actually answer my question.
What is the geometrical meaning of dot(gradient, eigenvector of the smallest eigenvalue)? Simply saying "flip the signs" doesn't answer that by itself. Maybe in my case -sgn(dot(g, eigenvector))*eigenvector STILL makes sense because an eigenvector is only defined up to sign, but I don't know. The crux of the question is the geometry, not a naive change of sign. You don't change signs mindlessly; for example, when solving g + H*d = 0 you don't suddenly say d = inv(H)*g -- the solution is d = -inv(H)*g. My question is one of geometry.
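To pin down at least the sign-ambiguity part, here is the kind of toy check I have in mind (a NumPy sketch, all the data made up by me): whichever orientation of the eigenvector the solver returns, the formula gives the same direction, and the slope along it is -|g'*v|, which is never positive.

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.standard_normal(5)
v = rng.standard_normal(5)
v /= np.linalg.norm(v)            # stand-in for a unit eigenvector

# An eigenvector is only determined up to sign, but the formula doesn't care:
# with p = -sign(g'u) * u, the slope along p is g'p = -|g'u| <= 0 either way.
for u in (v, -v):
    p = -np.sign(g @ u) * u
    print(g @ p)                  # same non-positive number both times
```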
 