Ocifer said:
As a follow up:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).
From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.
My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.
How can this be?
Here is the strict wording of the theorem I'm using:
"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"
Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?
The theorem is not true if S has boundaries, because then the minimum can occur on the boundary and not be a stationary point. Example: minimize f(x) = x such that x ≥ 0. The solution is at x = 0, where f'(x) > 0.
Also: the Hessian in your example is NOT positive semidefinite everywhere. The Hessian, H, has the form H = 4A, where
A = \pmatrix{3x^2&-1\\-1&3y^2}
Here is a little theorem that you may find useful: in order that a symmetric nxn matrix A be positive semidefinite, it must be the case that: (1) all diagonal elements a_{ii} ≥ 0; and (2) if a_{ii} = 0 for some i, then a_{ij} = a_{ji} = 0 for all j (and that fixed i); that is, if a diagonal element is zero, all elements in the corresponding row and column must also be zero. (This is very easy to prove; just assume it does not hold, and show how to get a quadratic form x^T A x that can be > 0 for some x and < 0 for other x.)
So, along either the x or y-axes, we have x = 0 or y = 0, so A is not psd (because it has a nonzero off-diagonal element and a zero diagonal). In fact, in order for A to be psd we need
9 x^2 y^2 - 1 \geq 0 \Longrightarrow |x| |y| \geq 1/3. Therefore, A is psd in the four regions {x > 0, y > 0, xy ≥ 1/3}, {x < 0, y < 0, xy ≥ 1/3}, {x > 0, y < 0, xy < -1/3} and {x < 0, y > 0, xy < -1/3}. Outside these four regions, A is indefinite.
To check if you have the global minima, just look at a simple lower bound b(x,y) on f(x,y), namely: for all (x,y) we have
f(x,y ) \geq x^4 + y^4 + 1 - 4|x| |y| \equiv b(x,y),
so a global min of f is bounded from below by a global min of b. By symmetry, the four global minima of b are just reflections of the minimum of b in the first quadrant, where b = f. So we can just look at f in the first quadrant, and get the stationary point therein, which are at (0,0) or (1,1). To show that (1,1) is the global min of f in the first quadrant, you can solve the problem of minimizing f in the "indefinite" region R1 = {x ≥ 0, y ≥ 0, xy ≤ 1/3}, but since this has inequality constraints you must use the so-called Karush-Kuhn-Tucker conditions. (These are like Lagrange multiplier conditions, but with modifications to handle inequalities.) I took the lazy way out and just used a package to solve the problem. The solution is x = y = 1/sqrt(3), giving f = -1/9. Note that this point is on the boundary xy = 1/3 of the region and has grad f = <-8*sqrt(3)/9, -8*sqrt(3)/9>, which is nonzero.
It follows that the global min of b in the first quadrant is (1,1), which is therefore also a global min of f.
RGV