Bivariate Function Optimization

Click For Summary

Homework Help Overview

The discussion revolves around the optimization of a bivariate function, specifically focusing on finding critical points and classifying them using the Hessian matrix. The original poster presents the function f(x,y) = x^4 + y^4 − 4xy + 1 and discusses the identification of critical points and their classification as local minima.

Discussion Character

  • Exploratory, Assumption checking, Conceptual clarification

Approaches and Questions Raised

  • Participants explore the process of finding critical points by setting the gradient to zero and evaluating the Hessian matrix. Questions arise regarding the classification of stationary points, particularly the distinction between local and global minima, and the implications of the Hessian's properties.

Discussion Status

The discussion is active, with participants providing insights into the properties of the Hessian matrix and its implications for the classification of critical points. Some participants question the assumptions made about the Hessian's definiteness and its role in determining global minima, while others clarify the conditions under which the theorem regarding positive semidefiniteness applies.

Contextual Notes

There is mention of potential confusion regarding the interpretation of the theorem related to the Hessian matrix and its implications for stationary points, particularly in the context of boundaries and local versus global minima. Additionally, some participants note errors in calculations that affect the classification of the Hessian.

Ocifer
Messages
30
Reaction score
0

Homework Statement


In general, I've been given a few functions of two variables, x and y. I have been asked to find all critical points by setting the gradient of the function equal to 0.

Further we are asked to classify these critical points using some given rules regarding the Hessian matrix.

Homework Equations



The Attempt at a Solution



For example, I'm given

f(x,y) = x^4 + y^4 − 4xy + 1

By solving the gradient equation, I got the fixed points (0,0) , (1,1) , and (-1,-1).

When I look at the Hessian matrix evaluated at either (1,1) or (-1,-1), the Hessian is the same and is positive definite. From this I conclude that both (1,1) and (-1,-1) are local minima with f(1,1) = f(-1,-1) = -1

How can I show that this is also a global minimum?
 
Physics news on Phys.org
As a follow up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?
 
Ocifer said:
I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer.
The Hessian of your function is not positive semidefinite for all (x,y), is it? According to my calculations, it only has that property where |xy| ≥ 2/3.
 
Ocifer said:
As a follow up:

I've found a result in my notes stating that if the Hessian matrix is positive semidefinite for all (x,y) then any stationary point is a global minimizer. It's easy to show that the Hessian for this particular function is positive semidefinite for all (x,y) (I used principal minors technique).

From this I wish to conclude that (-1,-1) and (1,1) are also global minima, with function value -1 at these points.

My confusion is now about the point (0,0) which I also found to be a stationary point. By the theorem I used above, it should follow that (0,0) is also a global min, but f(0,0) = 1 which is greater than -1.

How can this be?

Here is the strict wording of the theorem I'm using:

"H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer of f in S if and only if x is a stationary point of f ]"

Can someone explain this to me? (0,0) satisfies the gradient equations and so is a stationary point, but it's clearly not a global minimum? Am I misinterpreting the theorem?

The theorem is not true if S has boundaries, because then the minimum can occur on the boundary and not be a stationary point. Example: minimize f(x) = x such that x ≥ 0. The solution is at x = 0, where f'(x) > 0.

Also: the Hessian in your example is NOT positive semidefinite everywhere. The Hessian, H, has the form H = 4A, where
A = \pmatrix{3x^2&-1\\-1&3y^2}
Here is a little theorem that you may find useful: in order that a symmetric nxn matrix A be positive semidefinite, it must be the case that: (1) all diagonal elements a_{ii} ≥ 0; and (2) if a_{ii} = 0 for some i, then a_{ij} = a_{ji} = 0 for all j (and that fixed i); that is, if a diagonal element is zero, all elements in the corresponding row and column must also be zero. (This is very easy to prove; just assume it does not hold, and show how to get a quadratic form x^T A x that can be > 0 for some x and < 0 for other x.)

So, along either the x or y-axes, we have x = 0 or y = 0, so A is not psd (because it has a nonzero off-diagonal element and a zero diagonal). In fact, in order for A to be psd we need
9 x^2 y^2 - 1 \geq 0 \Longrightarrow |x| |y| \geq 1/3. Therefore, A is psd in the four regions {x > 0, y > 0, xy ≥ 1/3}, {x < 0, y < 0, xy ≥ 1/3}, {x > 0, y < 0, xy < -1/3} and {x < 0, y > 0, xy < -1/3}. Outside these four regions, A is indefinite.

To check if you have the global minima, just look at a simple lower bound b(x,y) on f(x,y), namely: for all (x,y) we have
f(x,y ) \geq x^4 + y^4 + 1 - 4|x| |y| \equiv b(x,y),
so a global min of f is bounded from below by a global min of b. By symmetry, the four global minima of b are just reflections of the minimum of b in the first quadrant, where b = f. So we can just look at f in the first quadrant, and get the stationary point therein, which are at (0,0) or (1,1). To show that (1,1) is the global min of f in the first quadrant, you can solve the problem of minimizing f in the "indefinite" region R1 = {x ≥ 0, y ≥ 0, xy ≤ 1/3}, but since this has inequality constraints you must use the so-called Karush-Kuhn-Tucker conditions. (These are like Lagrange multiplier conditions, but with modifications to handle inequalities.) I took the lazy way out and just used a package to solve the problem. The solution is x = y = 1/sqrt(3), giving f = -1/9. Note that this point is on the boundary xy = 1/3 of the region and has grad f = <-8*sqrt(3)/9, -8*sqrt(3)/9>, which is nonzero.

It follows that the global min of b in the first quadrant is (1,1), which is therefore also a global min of f.

RGV
 
Thanks for your replies. I had made an arithmetic error when taking the 2nd derivatives, so my Hessian was off. Thank you for pointing out my error. It turns out the prof had poorly specified what he wanted with the question. He only cared about local results, but thank you for your feedback.
 

Similar threads

Replies
4
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
Replies
1
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 24 ·
Replies
24
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 11 ·
Replies
11
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K