Weird theorem on critical points for multivariable functions

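In summary, the thread explains why the second derivative test for a function of two variables uses [itex]f_{xx}f_{yy}- (f_{xy})^2[/itex] rather than just [itex]f_{xx}f_{yy}[/itex]: the sign of that determinant tells us whether the eigenvalues of the matrix of second derivatives have the same sign (a local minimum or maximum) or opposite signs (a saddle point).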
  • #1
Nikitin
Why does it write all that stuff about [itex]f_{xy}^2[/itex]? Isn't it unnecessary?

I mean, isn't the following true?

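If at a point (a,b) we have [itex]f_x= f_y= 0[/itex] and [itex]f_{xx}f_{yy}> 0[/itex], then the second partial derivatives [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] must be either both negative or both positive, and thus the point (a,b) is a local minimum or maximum. And if [itex]f_{xx}f_{yy}< 0[/itex], then (a,b) is a saddle point.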

Right? So why bring in the [itex]f_{xy}^2[/itex]?
 

Attachment: bilde(2).JPG
  • #2
[itex] f(x,y) = x^2 + 100xy +y^2 [/itex]

This has a saddle point at the origin, even though [itex]f_{xx}f_{yy}= 4> 0[/itex] there. If you look at f(t,t) you get 102t^2, which is positive for nonzero t. If you look at f(t,-t) you get -98t^2, which is negative. So you have both positive and negative values near the origin.
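
A quick numerical check of this example, as a minimal Python sketch. The name `f`, the sample value of `t`, and the variable names are illustrative choices, not part of the original post:

```python
def f(x, y):
    # The example from this post: x^2 + 100xy + y^2
    return x**2 + 100 * x * y + y**2

t = 0.01
along_diagonal = f(t, t)        # equals 102 * t^2
along_antidiagonal = f(t, -t)   # equals -98 * t^2

# The second partials of this quadratic are constant:
# f_xx = f_yy = 2 and f_xy = 100.
fxx, fyy, fxy = 2.0, 2.0, 100.0
product_only = fxx * fyy            # what the shortcut in post #1 looks at
discriminant = fxx * fyy - fxy**2   # what the full test looks at

print(along_diagonal, along_antidiagonal)
print(product_only, discriminant)
```

The product `fxx * fyy` is positive while the full expression is negative, which matches the sign change of `f` between the two lines through the origin.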
 
  • #3
Hmm, thanks. But is there an intuitive explanation for the theorem?
 
  • #4
A more precise way of thinking about it is this: if f(x,y) is a function of two variables, then its "derivative" is NOT just the partial derivatives, [itex]\partial f/\partial x[/itex] and [itex]\partial f/\partial y[/itex] (it is possible for those partial derivatives to exist at a point while f itself is not even continuous, much less differentiable), but the gradient [itex]\nabla f= (\partial f/\partial x)\vec{i}+ (\partial f/\partial y)\vec{j}[/itex]. (Even more precisely, the derivative is the linear transformation, at each point, given by the dot product [itex]\left((\partial f/\partial x)\vec{i}+ (\partial f/\partial y)\vec{j}\right)\cdot \left((x- x_0)\vec{i}+ (y- y_0)\vec{j}\right)[/itex] but that is "represented" by the gradient vector.)

In that sense the second derivative is given by the linear transformation "represented", at each point, by the matrix
[tex]\begin{bmatrix}\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x\partial y} \\ \frac{\partial^2 f}{\partial y\partial x} & \frac{\partial^2 f}{\partial y^2}\end{bmatrix}[/tex]

As long as f has continuous second derivatives, the "mixed" second derivatives are the same, so that is a symmetric matrix. It therefore has real eigenvalues and two independent (in fact orthogonal) eigenvectors. If we were to use the directions of those eigenvectors as coordinate lines, x' and y', the matrix representing the second derivative would be "diagonal":
[tex]\begin{bmatrix}\frac{\partial^2 f}{\partial x'^2} & 0 \\ 0 &\frac{\partial^2 f}{\partial y'^2}\end{bmatrix}[/tex]
where those two derivatives (evaluated at the given point) are the "eigenvalues" of the original second derivative matrix.
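
As a rough illustration of that change of coordinates, the sketch below computes the eigenvalues and unit eigenvectors of a symmetric two by two matrix in closed form. The function name `symmetric_eigen` and the sample entries (taken from the example in post #2) are assumptions made for illustration only:

```python
import math

def symmetric_eigen(a, b, c):
    """Eigenvalues and unit eigenvectors of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    lam1, lam2 = mean + radius, mean - radius
    # Angle of the rotation that makes the matrix diagonal.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))    # direction of the x' axis
    v2 = (-math.sin(theta), math.cos(theta))   # direction of the y' axis
    return (lam1, v1), (lam2, v2)

# Second derivative matrix of x^2 + 100xy + y^2: f_xx = 2, f_xy = 100, f_yy = 2.
(lam1, v1), (lam2, v2) = symmetric_eigen(2.0, 100.0, 2.0)
print(lam1, lam2)
print(v1, v2)
```

For these entries the eigenvector directions are the lines y = x and y = -x, the same two lines used in post #2.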

Now, at a point where the first derivatives are 0 (a critical point) and the "mixed" second derivatives are 0, as in the x', y' coordinate system, we can write [itex]f(x, y)= f(x_0, y_0)+ \frac{1}{2}f_{xx}(x_0,y_0)(x- x_0)^2+ \frac{1}{2}f_{yy}(x_0, y_0)(y- y_0)^2[/itex] to second degree. And it is easy to see from this that, with a and b positive numbers:
1) if [itex]f_{xx}(x_0, y_0)= a[/itex] and [itex]f_{yy}(x_0, y_0)= b[/itex] are both positive, we have [itex]f(x, y)= f(x_0, y_0)+ \frac{a}{2}(x- x_0)^2+ \frac{b}{2}(y- y_0)^2[/itex] so that [itex](x_0, y_0)[/itex] is a local "minimum".
2) if [itex]f_{xx}(x_0, y_0)= -a[/itex] and [itex]f_{yy}(x_0, y_0)= -b[/itex] are both negative, we have [itex]f(x, y)= f(x_0, y_0)- \frac{a}{2}(x- x_0)^2- \frac{b}{2}(y- y_0)^2[/itex] so that [itex](x_0, y_0)[/itex] is a local "maximum".
3) if [itex]f_{xx}(x_0, y_0)= a[/itex] and [itex]f_{yy}(x_0, y_0)= -b[/itex] are one positive and the other negative, we have [itex]f(x, y)= f(x_0, y_0)+ \frac{a}{2}(x- x_0)^2- \frac{b}{2}(y- y_0)^2[/itex] so that [itex](x_0, y_0)[/itex] is a "saddle point".
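
The three cases can be seen numerically by sampling the second degree part on a small circle around the critical point. This is only a sketch: `classify_by_sampling`, its parameters, and the sample coefficients are hypothetical, both coefficients are assumed nonzero, and a positive overall factor on the quadratic does not change the outcome:

```python
import math

def classify_by_sampling(a, b, radius=1e-3, samples=360):
    """Sample q(u, v) = a*u^2 + b*v^2 on a circle around the critical point.

    a and b are the signed coefficients of the two squared terms,
    with u = x - x_0 and v = y - y_0. Both are assumed to be nonzero.
    """
    values = []
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        u = radius * math.cos(angle)
        v = radius * math.sin(angle)
        values.append(a * u**2 + b * v**2)
    if all(val > 0 for val in values):
        return "local minimum"   # f rises in every direction
    if all(val < 0 for val in values):
        return "local maximum"   # f falls in every direction
    return "saddle point"        # f rises in some directions, falls in others

print(classify_by_sampling(1.0, 2.0))
print(classify_by_sampling(-1.0, -2.0))
print(classify_by_sampling(1.0, -2.0))
```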

So the question is about the eigenvalues of that two-by-two matrix. If both are positive, the point is a local minimum; if both are negative, a local maximum; and if they are of different sign, a saddle point (of course, just like in the one variable situation, if either is 0, this does not tell us). Further, the determinant is independent of the coordinate system, so the two determinants
[tex]\left|\begin{array}{cc}\frac{\partial^2 f}{\partial x'^2} & 0 \\ 0 & \frac{\partial^2 f}{\partial y'^2}\end{array}\right|= f_{x'x'}f_{y'y'}[/tex]
[tex]\left|\begin{array}{cc}\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x\partial y} \\ \frac{\partial^2 f}{\partial y\partial x} & \frac{\partial^2 f}{\partial y^2}\end{array}\right|= f_{xx}f_{yy}- (f_{xy})^2[/tex]
are the same. Since the determinant is the product of the two eigenvalues, both eigenvalues, and so both second derivatives [itex]f_{x'x'}[/itex] and [itex]f_{y'y'}[/itex], have the same sign, and we have either a minimum or a maximum, if and only if [itex]f_{xx}f_{yy}- (f_{xy})^2> 0[/itex]. They have different signs, and we have a saddle point, if and only if [itex]f_{xx}f_{yy}- (f_{xy})^2< 0[/itex]. In the first case [itex]f_{xx}f_{yy}> (f_{xy})^2\ge 0[/itex], so [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] share a sign: positive means a minimum, negative a maximum.
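
The resulting test is short enough to write out directly. The sketch below assumes the three second partial derivatives have already been evaluated at a critical point; the function name `second_derivative_test` and the sample inputs are illustrative:

```python
def second_derivative_test(fxx, fyy, fxy):
    """Classify a critical point of f(x, y) from its second partials there."""
    d = fxx * fyy - fxy**2   # determinant of the second derivative matrix
    if d > 0:
        # Both eigenvalues share a sign. Here fxx cannot be 0,
        # because fxx * fyy > fxy**2 >= 0.
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"    # d == 0: the test says nothing

print(second_derivative_test(2.0, 2.0, 0.0))     # x^2 + y^2 at the origin
print(second_derivative_test(-2.0, -2.0, 0.0))   # -x^2 - y^2 at the origin
print(second_derivative_test(2.0, 2.0, 100.0))   # x^2 + 100xy + y^2 at the origin
```

The last call is the example from post #2: the product of the pure second partials is positive, but the mixed term makes the determinant negative.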

This also shows why we do not have a similarly simple formula for three or more variables. All of the analysis, up to the diagonal matrix, goes through, but the determinant is then a product of three or more numbers and its sign alone does not tell us about the signs of the individual eigenvalues. If, in the three variable case, the product is positive, it might be that all three eigenvalues are positive or that one is positive and the other two negative.
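
A small sketch of that three variable ambiguity, assuming the second derivative matrix has already been diagonalized so that only its eigenvalues remain. The function name `describe` and the sample eigenvalues are hypothetical:

```python
from math import prod

def describe(eigenvalues):
    """Return (determinant, classification) for a diagonalized second derivative matrix."""
    det = prod(eigenvalues)   # the determinant is the product of the eigenvalues
    if any(e == 0 for e in eigenvalues):
        kind = "inconclusive"
    elif all(e > 0 for e in eigenvalues):
        kind = "local minimum"
    elif all(e < 0 for e in eigenvalues):
        kind = "local maximum"
    else:
        kind = "saddle point"
    return det, kind

# Same positive determinant, different behaviour at the critical point.
print(describe((1.0, 1.0, 1.0)))
print(describe((1.0, -1.0, -1.0)))
```

Both calls report the same determinant, but only the first is a minimum, so the sign of the determinant alone cannot decide the question here.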
 
  • #5
thanks!
 

1. What is the "Weird theorem on critical points for multivariable functions"?

The "Weird theorem on critical points for multivariable functions" discussed in this thread is the second derivative test for a function of two variables. At a critical point, where [itex]f_x= f_y= 0[/itex], it looks at the sign of [itex]f_{xx}f_{yy}- (f_{xy})^2[/itex]: if this is positive, the point is a local minimum (when [itex]f_{xx}> 0[/itex]) or a local maximum (when [itex]f_{xx}< 0[/itex]); if it is negative, the point is a saddle point; if it is zero, the test is inconclusive.

2. How is this theorem different from other theorems about critical points?

Unlike the second derivative test for a function of one variable, it is not enough to check the signs of [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] separately. The mixed derivative [itex]f_{xy}[/itex] also matters: for [itex]f(x,y)= x^2+ 100xy+ y^2[/itex] both [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] are positive at the origin, yet the origin is a saddle point. The [itex](f_{xy})^2[/itex] term is what makes the test look "weird" at first.

3. What is the significance of this theorem in multivariable calculus?

The "Weird theorem" is significant because it classifies critical points using the matrix of second derivatives. The sign of [itex]f_{xx}f_{yy}- (f_{xy})^2[/itex] is the sign of the product of that matrix's two eigenvalues, which is enough to separate minima and maxima from saddle points in two variables. The determinant alone does not settle the question for three or more variables, where the signs of the individual eigenvalues have to be examined.

4. Can you provide an example of how this theorem is applied?

Sure, for a function f(x,y) = x^2 + y^2, the first partial derivatives 2x and 2y are both zero at (0,0), so (0,0) is a critical point. There [itex]f_{xx}= f_{yy}= 2[/itex] and [itex]f_{xy}= 0[/itex], so [itex]f_{xx}f_{yy}- (f_{xy})^2= 4> 0[/itex] with [itex]f_{xx}> 0[/itex], and the test identifies (0,0) as a local minimum, where the function has its minimum value of 0.
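
As a rough numerical check of the example f(x,y) = x^2 + y^2, the sketch below estimates the first and second partial derivatives at (0,0) with central differences. The helper name `partials` and the step size `h` are assumptions chosen for illustration, not something taken from the thread:

```python
def f(x, y):
    return x**2 + y**2

def partials(func, x, y, h=1e-4):
    """Central-difference estimates of f_x, f_y, f_xx, f_yy, f_xy at (x, y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return fx, fy, fxx, fyy, fxy

fx, fy, fxx, fyy, fxy = partials(f, 0.0, 0.0)
discriminant = fxx * fyy - fxy**2
print(fx, fy)                # both close to 0: (0,0) is a critical point
print(fxx, fyy, fxy)         # close to 2, 2, 0
print(discriminant)          # close to 4, positive with fxx > 0: local minimum
```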

5. Are there any real-life applications of this theorem?

Yes, the "Weird theorem" can be applied in various fields such as economics, physics, and engineering, where multivariable functions are commonly used. It helps in classifying critical points of these functions as minima, maxima, or saddle points, which matters in optimization and decision-making problems.
