Proving recursion relations in BFGS nonlinear optimization

SUMMARY

This discussion concerns proving the recursion relations behind BFGS non-linear optimization. The key quantities are the Hessian approximation B_k, the search direction p_k, and the step size \alpha. The relation y_k = \nabla f(x_{k+1}) - \nabla f(x_k) is established, leading to the secant equation B_{k+1}(x_{k+1} - x_k) = y_k. The discussion emphasizes that B_k is not the exact Hessian but an approximation that improves through low-rank updates; for a quadratic objective with exact line searches it reaches the exact Hessian after finitely many line searches (see the numerical sketch after this summary block).

PREREQUISITES
  • Understanding of BFGS algorithm in non-linear optimization
  • Familiarity with Hessian matrices and their properties
  • Knowledge of gradient vectors and their role in optimization
  • Basic concepts of rank-1 updates in linear algebra
NEXT STEPS
  • Research the BFGS algorithm and its applications in optimization
  • Study rank-1 update formulas for Hessian approximations
  • Learn about the properties of Hessian matrices in optimization contexts
  • Explore the implications of exact line searches in non-linear optimization
USEFUL FOR

Students and professionals in optimization, particularly those studying or working with non-linear optimization techniques, as well as researchers interested in Hessian approximations and BFGS methods.
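
To make these quantities concrete, here is a minimal numerical sketch of a single quasi-Newton step on a quadratic objective, checking the secant equation at the end. The test problem (the matrix A and vector b) and all variable names are illustrative choices for this sketch, not taken from the thread:

```python
import numpy as np

# Illustrative quadratic test problem: f(x) = 0.5 x^T A x - b^T x,
# so grad f(x) = A x - b and the true Hessian is A.
# (A and b are made-up data, not from the original homework.)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

x_k = np.zeros(2)
B_k = np.eye(2)                          # initial Hessian approximation B_0 = I

g_k = grad(x_k)
p_k = -np.linalg.solve(B_k, g_k)         # search direction: B_k p_k = -grad f(x_k)
alpha = -(g_k @ p_k) / (p_k @ A @ p_k)   # exact line search (closed form on a quadratic)
s_k = alpha * p_k                        # step: s_k = alpha * p_k
x_next = x_k + s_k
y_k = grad(x_next) - grad(x_k)           # gradient difference

# Standard BFGS update of B_k: the sum of two rank-1 corrections.
B_next = (B_k
          - np.outer(B_k @ s_k, B_k @ s_k) / (s_k @ B_k @ s_k)
          + np.outer(y_k, y_k) / (y_k @ s_k))

# The secant equation B_{k+1} s_k = y_k from the thread holds by construction.
print(np.allclose(B_next @ s_k, y_k))    # True
```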

sdevoe

Homework Statement



Please see attached thumbnail
Here's what I know:
1) B_k is the Hessian
2) s_k = \alpha p_k
3) p_k is the search direction
4) \alpha is the step size

Homework Equations



y_k = \nabla f(x_{k+1}) - \nabla f(x_k)
B_{k+1}(x_{k+1} - x_k) = \nabla f(x_{k+1}) - \nabla f(x_k)

The Attempt at a Solution


B_{k+1}(x_{k+1} - x_k) = y_k

and then somehow from there I have to use the above to prove the H_{k+1} equation
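
The attachment is not reproduced here, but the H_{k+1} equation to be proved is presumably the standard BFGS update for the inverse approximation H_k = B_k^{-1} (an assumption, since only the thumbnail was posted). With s_k = x_{k+1} - x_k it reads

H_{k+1} = \left(I - \frac{s_k y_k^T}{y_k^T s_k}\right) H_k \left(I - \frac{y_k s_k^T}{y_k^T s_k}\right) + \frac{s_k s_k^T}{y_k^T s_k}

and satisfies the inverse secant condition H_{k+1} y_k = s_k, mirroring B_{k+1} s_k = y_k above.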
 

Attachments

  • Screen shot 2012-04-29 at 9.41.54 PM.png
sdevoe said:

B_{k+1}(x_{k+1} - x_k) = y_k

and then somehow from there I have to use the above to prove the H_{k+1} equation

You don't know that B_k is the Hessian; you only know that it is the current approximation to the Hessian. In a purely quadratic model with exact line searches, each line search produces a better approximation to the Hessian, so starting from B_0 = I (for example), you will have B_k = exact Hessian for some k <= n; that is, in at most n line searches you recover the Hessian, all without ever computing a second derivative! Going from B_k to B_{k+1} is a low-rank update (for BFGS, the sum of two rank-1 corrections). There are formulas for how to change the inverse after a rank-1 update; I don't remember them offhand, but they are widely available in the linear algebra and nonlinear optimization literature.

RGV
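
The inverse-update formula alluded to above is the Sherman-Morrison identity: for an invertible matrix A and vectors u, v with 1 + v^T A^{-1} u \neq 0,

(A + u v^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}

Since the BFGS correction to B_k is the sum of two rank-1 terms, applying this identity twice to B_{k+1} yields the closed-form H_{k+1} update quoted earlier. A quick numerical spot-check of the identity, using random made-up data (all names here are illustrative):

```python
import numpy as np

# Numerical spot-check of the Sherman-Morrison identity on random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)  # shift keeps A comfortably invertible
u = rng.standard_normal(4)
v = rng.standard_normal(4)

A_inv = np.linalg.inv(A)
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = A_inv - np.outer(A_inv @ u, v @ A_inv) / (1.0 + v @ A_inv @ u)
print(np.allclose(lhs, rhs))  # True
```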
 
