# Local min no other zeros of gradient

1. Apr 23, 2014

### jostpuur

Assume that $f:\mathbb{R}^N\to\mathbb{R}$ is a differentiable function and that $x_0\in\mathbb{R}^N$ is a local minimum of $f$. Also assume that $N\geq 2$ and that the gradient of $f$ has no zeros other than $x_0$. In other words

$$\nabla f(x)=0\quad\implies\quad x=x_0$$

Is $x_0$ then a global minimum?

2. Apr 23, 2014

### jbunniii

$$f(x,y) = (1 + 2x^2 - x^4)(1+y^2)$$
Then
$$\nabla f(x,y) = (4(x - x^3)(1+y^2), (1 + 2x^2 - x^4)(2y))$$
Therefore $\nabla f(x,y) = 0$ if and only if $(x,y) = (0,0)$. There is a local minimum at $(0,0)$. But there is no global minimum, as can be seen by setting $y=0$ and letting $x \rightarrow \infty$. There is also no global maximum, as can be seen by setting $x=0$ and letting $y \rightarrow \infty$.

Last edited: Apr 23, 2014

3. Apr 23, 2014

### jostpuur

In that attempt, the points $(x,y)=(\pm 1,0)$ are saddle points, so the gradient vanishes there as well.
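A quick numerical sanity check of this objection, as a minimal sketch: the function names `f` and `grad_f` are chosen here for illustration, and the gradient is simply the formula from post #2 typed in.

```python
def f(x, y):
    """Candidate function from post #2."""
    return (1 + 2 * x**2 - x**4) * (1 + y**2)


def grad_f(x, y):
    """Gradient of f, copied from the formula in post #2."""
    return (4 * (x - x**3) * (1 + y**2), (1 + 2 * x**2 - x**4) * 2 * y)


# The gradient vanishes at the origin and also at (1, 0) and (-1, 0).
for point in [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]:
    print(point, grad_f(*point))

# Saddle behaviour at (1, 0): f decreases along x but increases along y.
h = 1e-3
print(f(1 + h, 0) < f(1, 0), f(1 - h, 0) < f(1, 0), f(1, h) > f(1, 0))
```

All three gradients come out as zero vectors, and the last line shows the mixed behaviour typical of a saddle.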

4. Apr 23, 2014

### jbunniii

Oops, you're right.

5. Apr 23, 2014

### jostpuur

I think I just managed to draw a picture of a counterexample. We examine the two-dimensional case and define $f(x,y)=y^2-x^2$ for $y\geq 1$, $f(x,y)=y^2+x^2$ for $y\leq 1-\frac{1}{2}x^2$, and then "invent something" to fill the gap $1-\frac{1}{2}x^2 < y < 1$.

First I tried to prove the "yes" answer, but after encountering some difficulties I eventually started considering the possibility of a counterexample. At first I thought that a counterexample would need to be very pathological, and didn't put much effort into finding one. Now I am not so pessimistic, although it still seems tricky to get the formulas working.

6. Apr 23, 2014

### Staff: Mentor

I don't see how this is supposed to become a counterexample. Where is your local minimum? (x=0, y=1) (the only special point in your function) is not a minimum, as every (x=epsilon, y=1) leads to a smaller function value.

I don't think such a function exists, but I don't have a mathematical proof.

7. Apr 24, 2014

### jostpuur

The origin (x,y)=(0,0) would be the local minimum, and the only place where the gradient is zero. (x,y)=(0,1) would be the clever point where the function turns to reach locally smaller values without being a saddle point.

8. Apr 24, 2014

### Staff: Mentor

Ah okay. Nice idea.

For "invent something", I suggest:
$$f(x,y)=y^2 - \cos\left(2\pi \frac{1-y}{x^2}\right) x^2$$
Inside the gap, the argument of the cos runs between 0 and π: the cos is 1 on the upper border and -1 on the lower border.

For y=1, this reduces to y^2-x^2 and gives the right upper border.
For y=1-x^2/2, this reduces to y^2+x^2 and gives the right lower border.

The y-derivative:
$$\partial_y f(x,y) = 2y - 2 \pi \sin\left(2\pi \frac{1-y}{x^2}\right)$$
The sin vanishes on both borders, so this reduces to 2y for y=1 and for y=1-x^2/2, and therefore the borders fit.

The x-derivative:
$$\partial_x f(x,y) = -\frac{4 \pi (1-y)}{x} \sin\left(2\pi \frac{1-y}{x^2}\right) - 2x \cos\left(2\pi \frac{1-y}{x^2}\right)$$

It fits at the borders, as the sin is zero there and the cos is 1 at the upper border and -1 at the lower border, which gives -2x and +2x respectively.

By plotting it, it does not seem to have critical points in the relevant range, but I don't see a proof yet, as setting the derivatives to zero gives problematic equations.

9. Apr 25, 2014

### jostpuur

It is very peculiar that your attempt has the right gradients at the borders. It seems that the cosine term was written only with the intent of getting +1 on $y=1$, and -1 on $y=1-\frac{1}{2}x^2$. Do the gradients fit due to luck or was there something deliberate about the attempt?

By the way, I'm not sure if the factor $\frac{1}{2}$ was relevant in the definition of the border. When drawing the original picture I thought it would be natural to have the ball $x^2+y^2\leq 1$ under the parabola, but it doesn't seem very important anymore.

My further observations were these (although there is some chance of mistakes):

Writing $\theta=2\pi\frac{1-y}{x^2}$, the x-derivative inside the gap is $\partial_x f=-2x(\theta\sin\theta+\cos\theta)$. It can therefore be proven that the condition $\partial_x f=0$ defines a curve

$$1- y = \frac{\theta_0x^2}{2\pi}$$

where $\theta_0$ is a constant defined by the conditions $\theta_0\tan(\theta_0)=-1$ and $0<\theta_0<\pi$. The constant is roughly $\theta_0\approx 2.7984$. Then it suffices to prove that $\partial_y f\neq 0$ on the same curve.

$$\partial_y f = 2y- 2\pi\sin(\theta_0)$$

holds on the curve, and $\pi\sin(\theta_0)\approx 1.0572$ together with $y<1$ implies that $\partial_y f < 0$.
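A rough numerical cross-check of this argument, taken straight from the definition of f in post #8 rather than from any hand-derived formula. This is only a sketch: it assumes central finite differences are accurate enough here, and the helper names (`d_dx`, `d_dy`, `zero_of_dx_in_gap`) are made up for the illustration. For a few fixed x > 0 it locates the point in the gap where the x-derivative changes sign and then evaluates the y-derivative there.

```python
import math


def f(x, y):
    """Piecewise candidate from posts #5 and #8."""
    if y >= 1:
        return y**2 - x**2
    if y <= 1 - 0.5 * x**2:
        return y**2 + x**2
    return y**2 - math.cos(2 * math.pi * (1 - y) / x**2) * x**2


def d_dx(x, y, h=1e-6):
    """Central finite difference of f in the x direction."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def d_dy(x, y, h=1e-6):
    """Central finite difference of f in the y direction."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def zero_of_dx_in_gap(x, tol=1e-12):
    """Bisect in y across the gap for the sign change of df/dx (x > 0 fixed)."""
    gap = 0.5 * x**2
    lo, hi = 1 - 0.999 * gap, 1 - 0.001 * gap  # df/dx > 0 at lo, < 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_dx(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for x in (0.3, 0.5, 1.0):
    y0 = zero_of_dx_in_gap(x)
    theta = 2 * math.pi * (1 - y0) / x**2
    print(f"x={x}: y0={y0:.6f}, theta={theta:.4f}, df/dy={d_dy(x, y0):.4f}")
```

The printed angle should be the same for every x (the zero set of the x-derivative is a single parabola), and the y-derivative there should be negative, so no critical point shows up in the gap for these sample values.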

10. Apr 26, 2014

### Staff: Mentor

This was part of the construction. I looked for a smooth transition between y^2+x^2 and y^2-x^2 via a prefactor for the x^2. This prefactor has to go from -1 to +1 and its derivative has to be zero at the borders. The cos was a natural choice. Maybe a 3rd-order polynomial would be easier to analyze.

11. Apr 26, 2014

### jk22

The global minimum could also be at the boundary, or in the limit as x tends to infinity. However, I suppose that if the derivative does not tend towards 0, then the infimum is -infinity.

12. May 5, 2014

### daverusin

I'm afraid the proposed solution does not really work: this function is not actually differentiable. I tried to figure out how on earth it could be working, by sketching the level curves. I reorganized the function a bit, putting the questionable point at the origin and trying to simplify: for nonnegative X let g(X,y) = (y-1)^2 + h(X,y)*X where h(X,y) = +1 when y>X/2, -1 when y<0, and -cos(2pi y/X) otherwise. (The goal eventually was to let X=x^2, which will basically glue the picture of level curves of g to their mirror images to give the level curves of f, slightly squished.) The problematic point is now at the origin, and you know it's problematic because the level curve g=1 passes through it and has three components: parts of two different parabolas above and below the X-axis, and then also a curve that sort of snakes to the northeast from the origin. Specifically, it leaves the origin following its tangent line there, y=mX with m approximately 0.39492 (the solution to cos(2pi t)=-2t). You know there has to be such a curve, intermediate between all the simpler level curves for positive and negative values of the function g. Anyway, in the neighborhood of a point where g is differentiable and g' is nonzero, the level curves should all be approximately parallel curves; there cannot be three branches emanating from one point.

So, is the origin a critical point of g? It's easy enough to check (from the limit definition) that dg/dX = -1 and dg/dy = -2 there, so this is clearly not a point where g' is zero. But just because both partials exist does not make g differentiable! Differentiability is the property of being approximated by a linear function, which in this case would have to be L(X,y) = -X-2y. In particular, the only direction one ought to be able to leave the origin while staying on a level curve is in the direction of <2,-1>. Indeed, that matches the level curve in the bottom of the picture (where g(X,y) = (y-1)^2-X) but not the other two curves.

So this turns out to be a very interesting example ("grad g exists does not imply g differentiable") but not an example of the proposed phenomenon. I didn't give it much thought but I think having only one local min really does force that point to be a global min. But one does have to be careful; for example one can have "two mountains without a valley": f(x,y)=1 - (x^2-1)^2 - (x^2y-x-1)^2.

dave

13. May 6, 2014

### Staff: Mentor

Hmm... I think you can avoid issues with differentiability if you take X=x^4. That suppresses x-dependencies at the critical point without changing other parts. We would have to verify that 1-2y approximates g in the way the derivative requires it (I would expect it does).

14. May 6, 2014

### daverusin

Here is a smooth function with only one critical point (at the origin), which is a local minimum but not a global minimum: f(x,y) = x^2 + y^2(1-x)^3. If you start filling the graph with water, the origin becomes the bottom of a roughly-triangular lake lying to the west of the line x=1. By the time the water has filled the graph to the height of 1, the lake includes the whole line x=1, stretching north and south to infinity. To the west the hills rise as would the sides of a bowl. A very narrow ridge rises to the east, too, but the ridge is surrounded to the north and south by water (forming two seas that asymptotically approach the x-axis far to the east). If you walk away from the origin NE or SE you will eventually enter the seas and then follow the seabed well below a depth of zero.
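For readers who want to check these claims, here is a short verification worked out from the formula above (a sketch added for illustration, not part of the original post). The gradient is

$$\nabla f(x,y) = \big(2x - 3y^2(1-x)^2,\; 2y(1-x)^3\big)$$

The second component vanishes only if $y=0$ or $x=1$. If $y=0$, the first component is $2x$, which forces $x=0$. If $x=1$, the first component equals $2\neq 0$. So the origin is the only critical point. Near the origin $f(x,y)=x^2+y^2+O(\|(x,y)\|^3)$, so it is a strict local minimum with $f(0,0)=0$. It is not a global minimum, since for example

$$f(2,3) = 4 + 9\cdot(1-2)^3 = -5 < 0$$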

FWIW, the non-compactness of the domain appears to be kind of critical. This is the basis of the part of topology called "surgery" (and cobordism, I guess). Given a smooth function f : M --> R defined on a compact manifold, one can recover the topology of M by taping together the level curves (or more precisely the inverse images of intervals in R that contain none of the critical values). There is a fairly elementary treatment of these ideas in Chapter 5 of Andrew Wallace's book "Differential Topology".

15. May 6, 2014

### jostpuur

It is true that our argument had a gap: we forgot to prove differentiability at $(x,y)=(0,1)$. If the function is differentiable there, the gradient must be

$$\nabla f(0,1) = (0,2)$$

This means that the function is differentiable at this point if

$$f(x,y) = 1 + 2(y-1) + o(\|(x,y-1)\|)$$

holds when $(x,y)\to (0,1)$. This obviously holds for $1\leq y$ and $y\leq 1-\frac{1}{2}x^2$, so the remaining task is to prove that

$$y^2 - \cos\Big(2\pi \frac{1-y}{x^2}\Big)x^2= -1 + 2y + o(\|(x,y-1)\|)$$

holds for $1-\frac{1}{2}x^2<y<1$. The claim is equivalent to the claim

$$(y-1)^2 - \cos\Big(2\pi\frac{1-y}{x^2}\Big)x^2 = o(\|(x,y-1)\|)$$

The triangle inequality implies

$$\Big|(y-1)^2 - \cos\Big(2\pi\frac{1-y}{x^2}\Big)x^2\Big| \leq (y-1)^2 + x^2$$

so the claim holds, and in fact the error term can be written in the form $O(\|(x,y-1)\|^2)$.

The explanation by daverusin was confusing to me, but it seems clear now that our function is differentiable everywhere.
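The bound can also be spot-checked numerically. The following is only a sketch: it samples random points around (0,1) and compares the error of the candidate linear approximation with the squared distance. The names `linear_part` and `worst_ratio`, the sample size, and the seed are arbitrary choices made for this illustration.

```python
import math
import random


def f(x, y):
    """Piecewise candidate from posts #5 and #8."""
    if y >= 1:
        return y**2 - x**2
    if y <= 1 - 0.5 * x**2:
        return y**2 + x**2
    return y**2 - math.cos(2 * math.pi * (1 - y) / x**2) * x**2


def linear_part(x, y):
    """Candidate tangent plane at (0, 1): value 1, gradient (0, 2)."""
    return 1 + 2 * (y - 1)


random.seed(0)  # arbitrary seed, only for reproducibility
worst_ratio = 0.0
for _ in range(100_000):
    r = 10 ** random.uniform(-4, 0)  # radii spread over several scales
    phi = random.uniform(0, 2 * math.pi)
    x, y = r * math.cos(phi), 1 + r * math.sin(phi)
    err = abs(f(x, y) - linear_part(x, y))
    worst_ratio = max(worst_ratio, err / r**2)

print(worst_ratio)  # stays at or below 1, up to rounding
```

A ratio bounded by 1 at every scale is exactly the $O(\|(x,y-1)\|^2)$ error claimed above. A finite sample is of course no proof, only a consistency check.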

Last edited: May 6, 2014

16. May 6, 2014

### jostpuur

Well, that was surprising in its simplicity...

17. May 6, 2014

### Staff: Mentor

It uses the same concept: drawn with lines of equal height, two level lines touch each other. That allows the transition between the local minimum and the unbounded region.
It is nice to have such a simple example.

18. May 6, 2014

### daverusin

I stand corrected: jostpuur's example is indeed differentiable at the "nasty" spot; indeed, the simple inequality || f(x,y) - L(x,y) || <= ||v||^2 holds for all displacements v from the point (0,1). The multiple branches of the level curve through that point all leave the point in the same direction (east-west). The non-differentiability in my function g is smoothed out by the substitution X=x^2.

So I had another look at Wallace's book, which sure did make it seem like level curves could not bifurcate as this one does. I see now that Wallace assumes all functions are C^\infty, which imposes a much more regular behaviour than jostpuur's example, which is differentiable (has a linear approximation) but does not even have a quadratic approximation with ||f(x,y) - Q(x,y) || = o(||v||^2), which Wallace requires to get the pretty results.

PS -- I write calculus contests and am always on the lookout for surprising functions like these so feel free to send me examples!

19. May 6, 2014

### jostpuur

There is at least one thing different in these examples (besides the simplicity issue). My remark concerning $\partial_y f$ in post #9 proves that the first example is not continuously differentiable at $(x,y)=(0,1)$. So in the light of the first example only, one might still have suspected that a continuously differentiable counterexample couldn't exist, but that possibility has now been ruled out by the second example.
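To spell out the remark (a sketch that uses only the formula for $\partial_y f$ from post #8 and writes $\theta_0\in(0,\pi)$ for the constant from post #9): along the curve $1-y=\frac{\theta_0x^2}{2\pi}$ the argument of the trigonometric functions is constantly $\theta_0$, so

$$\lim_{x\to 0}\partial_y f\Big(x,\,1-\frac{\theta_0x^2}{2\pi}\Big) = 2 - 2\pi\sin(\theta_0) \neq 2 = \partial_y f(0,1)$$

because $\sin(\theta_0)\neq 0$ for $0<\theta_0<\pi$. Hence $\partial_y f$ is not continuous at $(0,1)$.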