Visualizing second derivative test - Hessian

In summary, the second derivative test in the two-variable case examines the behavior of a function at a critical point by approximating it with a polynomial. If the Hessian determinant is negative, the function has a saddle point, meaning it is increasing in some directions and decreasing in others. This can be seen by looking at the quadratic form of the polynomial approximation and noting that the signs the quadratic form takes in different directions determine the shape of the surface and thus the behavior of the function.
  • #1
Phoeniyx
Hey guys. I am having some trouble visualizing one aspect of the Second derivative test in the 2 variable case (related to #3 below). Essentially, what does the curve look like when [itex]f_{xx}f_{yy} > 0[/itex], BUT [itex]f_{xx}f_{yy} < [f_{xy}]^{2}[/itex]?

To be more detailed, if the function is f(x,y), H(x,y) is the Hessian matrix of f and D is the determinant of H, where [itex]D = Det(H(x,y)) = f_{xx}f_{yy} - [f_{xy}]^{2} [/itex]

1) If D(a, b) > 0 and [itex]f_{xx}(a,b) > 0[/itex] => local minimum
2) If D(a, b) > 0 and [itex]f_{xx}(a,b) < 0[/itex] => local maximum
3) If D(a, b) < 0 => saddle point
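These three rules can be sketched as a small helper (a sketch in Python; the function name is my own):

```python
def classify_critical_point(fxx, fyy, fxy):
    """Classify a critical point from its second partials via D = fxx*fyy - fxy**2."""
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # same-sign curvature, strong enough to dominate the cross term
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information
```

For example, classify_critical_point(2, -2, 0) returns "saddle point", matching rule 3 with D = -4.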

I can totally see why [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] must have the same sign for there to be a max or a minimum - but I DON'T see why the product has to be "greater" than the square of [itex]f_{xy}[/itex] (as opposed to just 0) to have a max or min.

Thanks guys. Much appreciated.
 
  • #2
It's the sign of D that matters, not fxx, fyy, or fxy individually.

If D < 0, that is fxx·fyy < (fxy)^2, you have a saddle point.

For an example, consider f(x,y) = x^2 - y^2, a hyperbolic paraboloid (or, more informally, a saddle). Think of it as a Pringles chip.

Find the critical points

fx = 2x = 0
fy = -2y = 0
fxx = 2, fyy = -2, fxy = 0. D = -4 < 0.

So (0,0) is the critical point. If you analyze the behavior, f is concave up along the x-direction and concave down along the y-direction, so it can't be either a local maximum or minimum; it has to be a saddle point. Having D < 0 gives you a local min in some directions and a local max in others, which is why we call it a saddle point.
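A quick numerical sanity check of this example (a sketch in Python, using central finite differences at the critical point; the step size is my choice):

```python
def f(x, y):
    return x ** 2 - y ** 2  # the saddle example

h = 1e-4
# central second differences at the critical point (0, 0)
fxx = (f(h, 0) - 2 * f(0, 0) + f(-h, 0)) / h ** 2
fyy = (f(0, h) - 2 * f(0, 0) + f(0, -h)) / h ** 2
fxy = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h ** 2)
D = fxx * fyy - fxy ** 2  # recovers fxx = 2, fyy = -2, fxy = 0, D = -4 < 0
```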

If you want a more thorough understanding, search for a proof of the second derivative test; in the D < 0 case it's always some form of local mins in some directions and local maxes in others.
 
  • #3
Thanks. But what I meant was what happens when [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] have the same sign - but their product is less than [itex][f_{xy}]^2[/itex]. So, your example would not work since 2 and -2 have opposite signs.

Essentially, what I am asking is, what does it mean for a curve to have [itex]f_{xx} = 2[/itex], [itex]f_{yy} = 1[/itex], and [itex][f_{xy}]^2 = 4[/itex]. Or something similar. i.e. all of them have the same signs, but [itex]D = -2 < 0[/itex].

Thanks again.
 
  • #4
The same thing happens. Since D < 0 you have a saddle point: in some directions you have local maxes and in others you have local mins. That is, when D < 0 you don't even bother to check the sign of fxx, because D < 0 => saddle point. The example I chose was meant to illustrate where the name "saddle" comes from.

What it means is that if at the critical point (a,b) we have D(a,b) < 0, then in some directions you will have points with values larger than f(a,b), and in others points with values smaller than f(a,b), so (a,b) can't be a relative extremum (which is visualized by thinking of a saddle). Since you can't call it a relative max or min, we call it a saddle point.

Edit: Seems like your contention is with the structure of the Hessian determinant; you are questioning the relevance of fxy. In single-variable calculus you checked whether something was a local max or min by taking the second derivative: f''() < 0 told you local max and f''() > 0 told you local min. In multivariable calculus you have second-order partial derivatives, and there are four possibilities: fxx, fyy, fxy, and fyx. But by Clairaut's theorem, under most conditions fxy = fyx. That may provide some solace as to why fxy plays a role here.

Why you can conclude that D < 0 implies you are increasing in certain directions and decreasing in others comes back to the proof and to where D comes from. But the statement in itself means exactly that: when D(a,b) < 0 you are both increasing and decreasing, in a sense.
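Taking the numbers from post #3 (fxx = 2, fyy = 1, fxy = 2, so D = -2), one function with exactly those second partials at the origin (my own choice of example) is f(x,y) = x^2 + 2xy + y^2/2. A quick Python sketch shows it increases along one direction and decreases along another:

```python
def f(x, y):
    # at (0, 0): f_xx = 2, f_yy = 1, f_xy = 2, so D = 2*1 - 2**2 = -2
    return x ** 2 + 2 * x * y + 0.5 * y ** 2

t = 0.1
along_x = f(t, 0)       # direction (1, 0): positive, f rises above f(0, 0) = 0
along_diag = f(t, -t)   # direction (1, -1): negative, f dips below f(0, 0)
```

Even though fxx and fyy are both positive, the cross term fxy tilts the surface enough to create a saddle.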

Also: The last paragraph of this section addresses the geometric interpretation of the case when fxxfyy < (fxy)^2.

http://en.wikipedia.org/wiki/Second...etric_interpretation_in_the_two-variable_case
 
  • #5
Thanks Gridvvk. What you said in the [Edit] section does provide some solace :). I guess "proof of second derivative test" is what I should Google for?

It's interesting that you said [itex]f_{xy} = f_{yx}[/itex] under "most" conditions. I always presumed it to be the case under "all" conditions. Can you provide some brief details as to when it is not the case?

Thanks again.
 
  • #6
Phoeniyx said:
Thanks Gridvvk. What you said in the [Edit] section does provide some solace :). I guess "proof of second derivative test" is what I should Google for?

It's interesting that you said [itex]f_{xy} = f_{yx}[/itex] under "most" conditions. I always presumed it to be the case under "all" conditions. Can you provide some brief details as to when it is not the case?

Thanks again.

Wiki provides enough information on ##f_{xy} = f_{yx}##: http://en.wikipedia.org/wiki/Symmetry_of_second_derivatives
 
  • #7
Excellent. Thank you. Wasn't sure what to search for.
 
  • #8
Anyway, the trick is to approximate your function with a polynomial function. Let's do this in the one-variable case first.

Let ##f:\mathbb{R}\rightarrow \mathbb{R}## be a "nice" enough function. Assume that at ##0##, we have a critical point (meaning the first derivative is ##0##). By Taylor's approximation, we can write

[tex]f(x) \sim f(0) + f^\prime(0) x + \frac{f^{\prime\prime}(0)}{2}x^2 = f(0) + \frac{f^{\prime\prime}(0)}{2}x^2[/tex]

So the function ##f## can be reasonably well approximated by a parabola at ##0##. Now, whether there is a local minimum/maximum/saddle point can also be read from this form.

If ##f^{\prime\prime}(0)>0##, then we approximate our function ##f## at ##0## by a polynomial of the form ##\alpha x^2 + \gamma##, where ##\alpha >0##. We know that these polynomials have a minimum at ##0##. So ##f## has a (local) minimum too.

Likewise, if ##f^{\prime\prime}(0) < 0##, then we have a local maximum.
Now, if ##f^{\prime\prime}(0) = 0##, then we approximate ##f## by a constant function. This is not a quadratic function, so there is not enough information here. So this case is inconclusive.
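To make the one-variable story concrete (cos is my own choice of "nice" function): ##f(x)=\cos x## has a critical point at ##0## with ##f^{\prime\prime}(0)=-1<0##, so the parabola ##1 - x^2/2## approximates it near a local maximum:

```python
import math

def f(x):
    return math.cos(x)  # critical point at 0, since f'(0) = -sin(0) = 0

def quad_approx(x):
    return 1.0 - 0.5 * x ** 2  # f(0) + f''(0)/2 * x^2 with f''(0) = -1

# near 0 the parabola tracks f closely, and both peak at x = 0
gap = abs(f(0.1) - quad_approx(0.1))  # tiny: the higher-order terms are O(x^4)
```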

Now, in the ##2D##-case, the Taylor polynomial at ##(0,0)## has the form

[tex]f(x,y) \sim f(0,0) + \frac{f_{xx}(0,0)}{2} x^2+ f_{xy}(0,0) xy + \frac{f_{yy}(0,0)}{2} y^2[/tex]

So we simply need to study the behavior of functions of the form ##\alpha x^2 + \beta xy + \gamma y^2## in ##(0,0)##.

The case ##4\alpha \gamma-\beta^2<0## corresponds to a hyperbolic paraboloid:
[image: hyperbolic paraboloid (saddle surface)]

If your function looks like that, then you got a saddle point.

The case ##4\alpha\gamma-\beta^2>0## corresponds to an elliptic paraboloid:
[image: elliptic paraboloid (paraboloid of revolution)]

Whether it opens upward or downward is determined by the sign of ##\alpha##. So you get either a maximum or a minimum, depending on ##\alpha##.
If ##4\alpha\gamma-\beta^2=0##, then you have a parabolic cylinder:
[image: parabolic cylinder]

So the extremum is attained along an entire line. But we cannot draw any conclusion here, since higher-order terms of ##f## (we only kept terms up to second order) might still influence the graph.
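The three cases above can be summarized in one small sketch (Python; the labels are mine):

```python
def classify_quadratic(a, b, c):
    """Shape of a*x^2 + b*x*y + c*y^2 near (0, 0), by the sign of 4ac - b^2."""
    disc = 4 * a * c - b ** 2
    if disc < 0:
        return "hyperbolic paraboloid (saddle)"
    if disc > 0:
        return "elliptic paraboloid (min)" if a > 0 else "elliptic paraboloid (max)"
    return "parabolic cylinder (inconclusive)"
```

For instance, classify_quadratic(1, 0, -1) is the saddle ##x^2 - y^2## from earlier in the thread.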

This is not a proof, but it should make clear where the test comes from.
 
  • #9
Thanks R136a1. That was very helpful. Appreciate you taking the time for a detailed explanation.
 

1. What is the purpose of the second derivative test and the Hessian matrix?

The second derivative test and the Hessian matrix are mathematical tools used to determine the behavior of a function at a critical point. They help us determine whether a critical point is a local minimum, local maximum, or a saddle point.

2. How is the Hessian matrix calculated?

The Hessian matrix is calculated by taking the second partial derivatives of a multivariable function and arranging them in a matrix form. The Hessian matrix is a square matrix with the same number of rows and columns as the number of variables in the function.

3. What information does the Hessian matrix provide about a function?

The Hessian matrix provides information about the concavity and curvature of a function at a critical point. This information is used to determine whether the critical point is a local minimum, local maximum, or a saddle point.

4. How is the second derivative test used to determine the nature of a critical point?

The second derivative test uses the Hessian matrix to determine the nature of a critical point. If the Hessian matrix is positive definite, the critical point is a local minimum. If the Hessian matrix is negative definite, the critical point is a local maximum. If the Hessian matrix has both positive and negative eigenvalues, the critical point is a saddle point.
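For a 2x2 Hessian the eigenvalues have a closed form, so the eigenvalue criterion can be checked directly (a sketch; the helper name is my own):

```python
import math

def hessian_eigenvalues(fxx, fxy, fyy):
    """Eigenvalues of the symmetric 2x2 Hessian [[fxx, fxy], [fxy, fyy]]."""
    mean = (fxx + fyy) / 2
    radius = math.sqrt(((fxx - fyy) / 2) ** 2 + fxy ** 2)
    return mean - radius, mean + radius

# fxx = 2, fyy = 1, fxy = 2 gives D = -2: one negative and one
# positive eigenvalue, i.e. an indefinite Hessian, i.e. a saddle
lo, hi = hessian_eigenvalues(2, 2, 1)
```

Note that the product of the eigenvalues equals det H = D, so mixed eigenvalue signs correspond exactly to D < 0.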

5. Can the second derivative test and the Hessian matrix be used for all types of functions?

No, the second derivative test and the Hessian matrix are only applicable for functions that have continuous second derivatives. If a function does not have continuous second derivatives, the second derivative test cannot be used to determine the nature of a critical point.
