Bivariate Function Optimization

Ocifer · Nov 20, 2012

Homework Statement

In general, I've been given a few functions of two variables, x and y. I have been asked to find all critical points by setting the gradient of the function equal to 0.

Further we are asked to classify these critical points using some given rules regarding the Hessian matrix.

Homework Equations

The Attempt at a Solution

For example, I'm given

[itex]f(x,y) = x^4 + y^4 − 4xy + 1[/itex]

By solving the gradient equation, I got the fixed points (0,0) , (1,1) , and (-1,-1).

When I look at the Hessian matrix evaluated at either (1,1) or (-1,-1), the Hessian is the same and is positive definite. From this I conclude that both (1,1) and (-1,-1) are local minima with f(1,1) = f(-1,-1) = -1

How can I show that this is also a global minimum?

Ocifer · Nov 20, 2012

As a follow up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?

haruspex · Nov 20, 2012

Ocifer said:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer.

The Hessian of your function is not positive semidefinite for all (x,y), is it? According to my calculations, it only has that property where |xy| ≥ 2/3.

Ray Vickson · Nov 20, 2012

Ocifer said:

As a follow up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?

The theorem is not true if S has boundaries, because then the minimum can occur on the boundary and not be a stationary point. Example: minimize f(x) = x such that x ≥ 0. The solution is at x = 0, where f'(x) > 0.

Also: the Hessian in your example is NOT positive semidefinite everywhere. The Hessian, H, has the form H = 4A, where
[tex]A = \pmatrix{3x^2&-1\\-1&3y^2}[/tex]
Here is a little theorem that you may find useful: in order that a symmetric nxn matrix A be positive semidefinite, it must be the case that: (1) all diagonal elements a_{ii} ≥ 0; and (2) if a_{ii} = 0 for some i, then a_{ij} = a_{ji} = 0 for all j (and that fixed i); that is, if a diagonal element is zero, all elements in the corresponding row and column must also be zero. (This is very easy to prove; just assume it does not hold, and show how to get a quadratic form x^T A x that can be > 0 for some x and < 0 for other x.)

So, along either the x or y-axes, we have x = 0 or y = 0, so A is not psd (because it has a nonzero off-diagonal element and a zero diagonal). In fact, in order for A to be psd we need
[tex]9 x^2 y^2 - 1 \geq 0 \Longrightarrow |x| |y| \geq 1/3.[/tex] Therefore, A is psd in the four regions {x > 0, y > 0, xy ≥ 1/3}, {x < 0, y < 0, xy ≥ 1/3}, {x > 0, y < 0, xy < -1/3} and {x < 0, y > 0, xy < -1/3}. Outside these four regions, A is indefinite.

To check if you have the global minima, just look at a simple lower bound b(x,y) on f(x,y), namely: for all (x,y) we have
[tex]f(x,y ) \geq x^4 + y^4 + 1 - 4|x| |y| \equiv b(x,y),[/tex]
so a global min of f is bounded from below by a global min of b. By symmetry, the four global minima of b are just reflections of the minimum of b in the first quadrant, where b = f. So we can just look at f in the first quadrant, and get the stationary point therein, which are at (0,0) or (1,1). To show that (1,1) is the global min of f in the first quadrant, you can solve the problem of minimizing f in the "indefinite" region R1 = {x ≥ 0, y ≥ 0, xy ≤ 1/3}, but since this has inequality constraints you must use the so-called Karush-Kuhn-Tucker conditions. (These are like Lagrange multiplier conditions, but with modifications to handle inequalities.) I took the lazy way out and just used a package to solve the problem. The solution is x = y = 1/sqrt(3), giving f = -1/9. Note that this point is on the boundary xy = 1/3 of the region and has grad f = <-8*sqrt(3)/9, -8*sqrt(3)/9>, which is nonzero.

It follows that the global min of b in the first quadrant is (1,1), which is therefore also a global min of f.

RGV

Ocifer · Nov 26, 2012

Thanks for your replies. I had made an arithmetic error when taking the 2nd derivatives, so my Hessian was off. Thank you for pointing out my error. It turns out the prof had poorly specified what he wanted with the question. He only cared about local results, but thank you for your feedback.

Bivariate Function Optimization

Homework Statement

Homework Equations

The Attempt at a Solution

Similar threads

Hi! Can someone explain about Differential Equations?

Deriving spatial derivatives

Is this the correct general solution of the given PDE?

What does "compute Aut(G)" mean?

J_1(x) = (x^2/10)*(J_1(x) + J_3(x)) How to solve?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect