Bivariate Function Optimization

Ocifer

Homework Statement


In general, I've been given a few functions of two variables, x and y. I have been asked to find all critical points by setting the gradient of the function equal to 0.

Further, we are asked to classify these critical points using the given rules regarding the Hessian matrix.

Homework Equations



The Attempt at a Solution



For example, I'm given

f(x,y) = x^4 + y^4 − 4xy + 1

By solving the gradient equations, I got the critical points (0,0), (1,1), and (-1,-1).
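For reference, one way to reproduce these critical points symbolically is with sympy (the library choice is mine; the thread itself works by hand):

```python
# A minimal sketch: solve grad f = 0 for f(x,y) = x^4 + y^4 - 4xy + 1 using sympy.
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 + y**4 - 4*x*y + 1

# The gradient equations are f_x = 4x^3 - 4y = 0 and f_y = 4y^3 - 4x = 0.
sols = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
real_crit = sorted((s[x], s[y]) for s in sols if s[x].is_real and s[y].is_real)
print(real_crit)  # [(-1, -1), (0, 0), (1, 1)]
```

Substituting y = x^3 into y^3 = x gives x^9 = x, whose real roots are 0 and ±1, in agreement with the list above.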

When I evaluate the Hessian matrix at either (1,1) or (-1,-1), it is the same at both points and is positive definite. From this I conclude that both (1,1) and (-1,-1) are local minima, with f(1,1) = f(-1,-1) = -1.
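As a numerical cross-check (my own, using numpy; not part of the thread), the Hessian of this f is H = [[12x^2, -4], [-4, 12y^2]], and its eigenvalues at the critical points confirm the classification:

```python
# Sketch: classify the critical points of f(x,y) = x^4 + y^4 - 4xy + 1
# via the eigenvalues of the Hessian H = [[12x^2, -4], [-4, 12y^2]].
import numpy as np

def hessian(x, y):
    return np.array([[12.0 * x**2, -4.0], [-4.0, 12.0 * y**2]])

for p in [(1, 1), (-1, -1), (0, 0)]:
    eig = np.linalg.eigvalsh(hessian(*p))
    print(p, eig)

# At (1,1) and (-1,-1) the eigenvalues are 8 and 16 (positive definite: local minima);
# at (0,0) they are -4 and 4 (indefinite: a saddle point, not a minimum).
```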

How can I show that this is also a global minimum?
 
As a follow up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum. Am I misinterpreting the theorem?
 
Ocifer said:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer.
The Hessian of your function is not positive semidefinite for all (x,y), is it? According to my calculations, it only has that property where |xy| ≥ 1/3.
 
Ocifer said:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. [...]

My confusion is now about the point (0,0), which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1, which is greater than -1.

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? Am I misinterpreting the theorem?

The theorem is not true if S has boundaries, because then the minimum can occur on the boundary and not be a stationary point. Example: minimize f(x) = x such that x ≥ 0. The solution is at x = 0, where f'(x) > 0.

Also: the Hessian in your example is NOT positive semidefinite everywhere. The Hessian, H, has the form H = 4A, where
##A = \begin{pmatrix} 3x^2 & -1 \\ -1 & 3y^2 \end{pmatrix}##
Here is a little theorem that you may find useful: in order for a symmetric n×n matrix A to be positive semidefinite, it must be the case that (1) all diagonal elements satisfy a_{ii} ≥ 0; and (2) if a_{ii} = 0 for some i, then a_{ij} = a_{ji} = 0 for all j; that is, if a diagonal element is zero, all elements in its row and column must also be zero. (This is easy to prove: assume it fails, and construct a quadratic form x^T A x that is > 0 for some x and < 0 for others.)

So, along the x- or y-axis we have x = 0 or y = 0, so A is not psd (it has a zero diagonal entry with a nonzero off-diagonal entry in the same row). In fact, for A to be psd we need
##9x^2y^2 - 1 \geq 0 \Longrightarrow |x|\,|y| \geq 1/3.##
Therefore, A is psd in the four regions {x > 0, y > 0, xy ≥ 1/3}, {x < 0, y < 0, xy ≥ 1/3}, {x > 0, y < 0, xy ≤ -1/3} and {x < 0, y > 0, xy ≤ -1/3}. Outside these four regions, A is indefinite.
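A small numeric sketch of that psd boundary (numpy; my own illustration, not from the thread):

```python
# Sketch: A = [[3x^2, -1], [-1, 3y^2]] has det A = 9 x^2 y^2 - 1, so A is psd
# exactly where |x||y| >= 1/3 (the diagonal entries are never negative).
import numpy as np

def is_psd(x, y, tol=1e-12):
    A = np.array([[3.0 * x**2, -1.0], [-1.0, 3.0 * y**2]])
    return bool((np.linalg.eigvalsh(A) >= -tol).all())

print(is_psd(1.0, 1.0))   # True:  |x||y| = 1 >= 1/3
print(is_psd(1.0, 0.0))   # False: zero diagonal entry, nonzero off-diagonal
print(is_psd(0.5, 0.5))   # False: |x||y| = 0.25 < 1/3
```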

To check whether you have the global minima, look at a simple lower bound b(x,y) on f(x,y): for all (x,y) we have
##f(x,y) \geq x^4 + y^4 + 1 - 4|x|\,|y| \equiv b(x,y),##
so a global min of f is bounded below by a global min of b. By symmetry, the four global minima of b are reflections of the minimum of b in the first quadrant, where b = f. So we can just look at f in the first quadrant and find the stationary points there, which are (0,0) and (1,1).

To show that (1,1) is the global min of f in the first quadrant, you can minimize f over the "indefinite" region R1 = {x ≥ 0, y ≥ 0, xy ≤ 1/3}, but since this problem has inequality constraints you must use the so-called Karush-Kuhn-Tucker (KKT) conditions. (These are like Lagrange multiplier conditions, but modified to handle inequalities.) I took the lazy way out and just used a package to solve the problem. The solution is ##x = y = 1/\sqrt{3}##, giving f = -1/9. Note that this point is on the boundary xy = 1/3 of the region and has ##\nabla f = \langle -8\sqrt{3}/9,\, -8\sqrt{3}/9 \rangle##, which is nonzero.

It follows that the global min of b in the first quadrant is at (1,1); since b = f there and f ≥ b everywhere, (1,1) (and by symmetry (-1,-1)) is a global min of f.
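A final brute-force sanity check (my own, not from the thread): sampling f on a grid confirms that nothing falls below -1.

```python
# Sketch: sample f(x,y) = x^4 + y^4 - 4xy + 1 on a grid; the smallest sampled
# value is -1, attained at (1,1) and (-1,-1). (For large |x|, |y| the quartic
# terms dominate, so restricting to [-3, 3]^2 is harmless.)
import numpy as np

xs = np.linspace(-3.0, 3.0, 601)  # step 0.01, so the grid contains ±1
X, Y = np.meshgrid(xs, xs)
F = X**4 + Y**4 - 4.0 * X * Y + 1.0
print(F.min())  # ≈ -1.0
```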

RGV
 
Thanks for your replies. I had made an arithmetic error in the second derivatives, so my Hessian was off; thank you for pointing that out. It turns out the professor had under-specified the question and only cared about local results.
 