Visualizing second derivative test - Hessian

  1. Dec 1, 2013 #1
    Hey guys. I am having some trouble visualizing one aspect of the Second derivative test in the 2 variable case (related to #3 below). Essentially, what does the curve look like when [itex]f_{xx}f_{yy} > 0[/itex], BUT [itex]f_{xx}f_{yy} < [f_{xy}]^{2}[/itex]?

    To be more detailed, if the function is f(x,y), H(x,y) is the Hessian matrix of f and D is the determinant of H, where [itex]D = Det(H(x,y)) = f_{xx}f_{yy} - [f_{xy}]^{2} [/itex]

    1) If D(a, b) > 0, and [itex]f_{xx}(a,b) > 0[/itex] => local minimum
    2) If D(a, b) > 0, and [itex]f_{xx}(a,b) < 0[/itex] => local maximum
    3) If D(a, b) < 0 => saddle point

    I can totally see why [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] must have the same sign for there to be a max or a minimum - but I DON'T see why the product has to be "greater" than the square of [itex]f_{xy}[/itex] (as opposed to just 0) to have a max or min.

    Thanks guys. Much appreciated.
  3. Dec 1, 2013 #2
    It's the sign of D that matters, not fxx, fyy, or fxy individually.

    If D < 0, that is, fxxfyy < (fxy)^2, you have a saddle point.

    For an example, consider f(x,y) = x^2 - y^2, a hyperbolic paraboloid (or, more informally, a saddle). Think of it as a Pringles chip.

    Find the critical points

    fx = 2x = 0
    fy = -2y = 0
    fxx = 2, fyy = -2, fxy = 0. D = -4 < 0.

    So (0,0) is the critical point. If you analyze the behavior, f is concave up along the x-direction (fxx > 0) and concave down along the y-direction (fyy < 0), so it can't be either a local maximum or a local minimum; it has to be a saddle point. Having D < 0 gives you a local min in some directions and a local max in others, so we call it a saddle point.
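
    As a rough numerical check of the computation above, here is a minimal sketch that estimates the second partials with central differences and applies the test. The helper names `hessian_2d` and `classify`, the step size `h`, and the tolerance `tol` are illustrative choices, not anything from a library.

```python
def hessian_2d(f, a, b, h=1e-4):
    """Approximate f_xx, f_yy, f_xy at (a, b) with central differences."""
    fxx = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    fyy = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    fxy = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return fxx, fyy, fxy


def classify(f, a, b, tol=1e-6):
    """Apply the second derivative test at a point assumed to be critical."""
    fxx, fyy, fxy = hessian_2d(f, a, b)
    d = fxx * fyy - fxy**2  # determinant of the Hessian
    if d > tol:
        return "local minimum" if fxx > 0 else "local maximum"
    if d < -tol:
        return "saddle point"
    return "inconclusive"


def saddle(x, y):
    return x**2 - y**2


print(classify(saddle, 0.0, 0.0))  # saddle point
```

    Note that `classify` does not verify that (a, b) is actually a critical point; that is assumed by the caller.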

    If you want a more thorough understanding, just look up a proof of the second derivative test. In the D < 0 case it always comes down to local mins in some directions and local maxes in others.
  4. Dec 1, 2013 #3
    Thanks. But what I meant was what happens when [itex]f_{xx}[/itex] and [itex]f_{yy}[/itex] have the same sign - but their product is less than [itex][f_{xy}]^2[/itex]. So, your example would not work since 2 and -2 have opposite signs.

    Essentially, what I am asking is, what does it mean for a curve to have [itex]f_{xx} = 2[/itex], [itex]f_{yy} = 1[/itex], and [itex][f_{xy}]^2 = 4[/itex]. Or something similar. i.e. all of them have the same signs, but [itex]D = -2 < 0[/itex].

    Thanks again.
  5. Dec 1, 2013 #4
    The same thing happens. Since D < 0 you have a saddle point: in some directions you have local maxes and in others you have local mins. That is, when D < 0 you don't even bother to check the sign of fxx, because D < 0 => saddle point. The example I chose was meant to illustrate where the name "saddle" comes from.

    What it means is that if D(a,b) < 0 at the critical point (a,b), then in some directions you will find points where f is larger than f(a,b), and in others points where f is smaller than f(a,b). So (a,b) can't be a relative extremum (which you can visualize by thinking of a saddle): you can't call it a relative max or min, so we call it a saddle point.
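
    To make this concrete for the numbers asked about (fxx = 2, fyy = 1, fxy = 2), here is a small sketch. The function f(x,y) = x^2 + 2xy + y^2/2 is just one assumed example with those second partials; it is concave up along both axes, yet it dips below f(0,0) along the line y = -x.

```python
def f(x, y):
    # Quadratic with f_xx = 2, f_yy = 1, f_xy = 2, so D = 2*1 - 2**2 = -2.
    return x**2 + 2 * x * y + 0.5 * y**2


t = 0.1
along_x = f(t, 0.0)     # slice y = 0: equals t**2, above f(0,0) = 0
along_y = f(0.0, t)     # slice x = 0: equals 0.5*t**2, above f(0,0) = 0
along_diag = f(t, -t)   # slice y = -x: equals -0.5*t**2, below f(0,0) = 0

print(along_x > 0, along_y > 0, along_diag < 0)  # True True True
```

    So fxx and fyy only describe the two axis directions; the cross term fxy can still pull the surface down along a diagonal.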

    Edit: It seems your contention is with the structure of the Hessian determinant: you are questioning the relevance of fxy. In single-variable calculus you checked whether a critical point was a local max or min by taking the second derivative: f''() < 0 told you local max and f''() > 0 told you local min. In multivariable calculus you have second-order partial derivatives, and there are four of them: fxx, fyy, fxy, and fyx. By Clairaut's theorem, under most conditions fxy = fyx. That may provide some solace as to why fxy plays a role here.

    Why D < 0 implies that you are increasing in certain directions and decreasing in others comes back to the proof and to where D comes from. But the statement itself means exactly that: when D(a,b) < 0, you are both increasing and decreasing, in a sense.

    Also: The last paragraph of this section addresses the geometric interpretation of the case when fxxfyy < (fxy)^2.

    Last edited: Dec 1, 2013
  6. Dec 1, 2013 #5
    Thanks Gridvvk. What you said in the [Edit] section does provide some solace :). I guess "proof of second derivative test" is what I should Google for?

    It's interesting what you said that [itex]f_{xy} = f_{yx}[/itex] for "most" conditions. Always presumed it to be the case for "all" conditions. Can you provide some brief details as to when it is not the case?

    Thanks again.
  7. Dec 1, 2013 #6
    Wiki provides enough information on ##f_{xy} = f_{yx}##: http://en.wikipedia.org/wiki/Symmetry_of_second_derivatives
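
    For a quick numerical illustration, the standard textbook counterexample is f(x,y) = xy(x^2 - y^2)/(x^2 + y^2) with f(0,0) = 0, whose mixed partials at the origin are -1 and +1. The sketch below estimates them with nested central differences; the helper names and the step sizes `inner` and `outer` are assumptions chosen for illustration.

```python
def f(x, y):
    # Standard counterexample; defined as 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)


def d_dx(g, x, y, h):
    """Central difference of g in the x-direction."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, x, y, h):
    """Central difference of g in the y-direction."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


inner, outer = 1e-6, 1e-3  # inner step must be much smaller than outer step


def f_x(x, y):
    return d_dx(f, x, y, inner)


def f_y(x, y):
    return d_dy(f, x, y, inner)


fxy = d_dy(f_x, 0.0, 0.0, outer)  # differentiate in x first, then in y
fyx = d_dx(f_y, 0.0, 0.0, outer)  # differentiate in y first, then in x

print(fxy, fyx)  # approximately -1 and +1
```

    The second partials of this f exist at the origin but are not continuous there, which is why the usual symmetry theorem does not apply.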
  8. Dec 1, 2013 #7
    Excellent. Thank you. Wasn't sure what to search for.
  9. Dec 1, 2013 #8
    Anyway, the trick is to approximate your function with a polynomial function. Let's do this in the one-variable case first.

    Let ##f:\mathbb{R}\rightarrow \mathbb{R}## be a "nice" enough function. Assume that at ##0##, we have a critical point (meaning the first derivative is ##0##). By Taylor's approximation, we can write

    [tex]f(x) \sim f(0) + f^\prime(0) x + \frac{f^{\prime\prime}(0)}{2}x^2 = f(0) + \frac{f^{\prime\prime}(0)}{2}x^2[/tex]

    So the function ##f## can be reasonably well approximated by a parabola at ##0##. Now, whether there is a local minimum/maximum/saddle point can also be read from this form.

    If ##f^{\prime\prime}(0)>0##, then we approximate our function ##f## at ##0## by a polynomial of the form ##\alpha x^2 + \gamma##, where ##\alpha >0##. We know that these polynomials have a minimum at ##0##. So ##f## has a (local) minimum too.

    Likewise, if ##f^{\prime\prime}(0) < 0##, then we have a local maximum.
    Now, if ##f^{\prime\prime}(0) = 0##, then we approximate ##f## by a constant function. This is not a quadratic function, so there is not enough information here. So this case is inconclusive.

    Now, in the ##2D##-case, the Taylor polynomial at ##(0,0)## has the form

    [tex]f(x,y) \sim f(0,0) + \frac{f_{xx}(0,0)}{2} x^2+ f_{xy}(0,0) xy + \frac{f_{yy}(0,0)}{2} y^2[/tex]

    So we simply need to study the behavior of functions of the form ##\alpha x^2 + \beta xy + \gamma y^2## in ##(0,0)##.
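
    One way to see why the comparison is with ##\beta^2## rather than with ##0## is to complete the square. The identity below assumes ##\alpha \neq 0## and is meant as a sketch, not a full proof:

    [tex]\alpha x^2 + \beta xy + \gamma y^2 = \alpha\left(x + \frac{\beta}{2\alpha}y\right)^2 + \frac{4\alpha\gamma - \beta^2}{4\alpha}y^2[/tex]

    With ##\alpha = f_{xx}(0,0)/2##, ##\beta = f_{xy}(0,0)## and ##\gamma = f_{yy}(0,0)/2##, we get ##4\alpha\gamma - \beta^2 = f_{xx}f_{yy} - f_{xy}^2 = D##. If ##D > 0##, both squared terms carry the sign of ##\alpha##, so the form is positive everywhere away from the origin (or negative everywhere). If ##D < 0##, the two terms have opposite signs, so the form takes both positive and negative values, even when ##\alpha## and ##\gamma## have the same sign.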

    The case ##4\alpha \gamma-\beta^2<0## corresponds to a hyperbolic paraboloid. If your function looks like that, then you have a saddle point.

    The case ##4\alpha\gamma-\beta^2>0## corresponds to an elliptic paraboloid. Whether it opens upward or downward is determined by the sign of ##\alpha##, so you get a minimum if ##\alpha > 0## and a maximum if ##\alpha < 0##.

    If ##4\alpha\gamma-\beta^2=0##, then you have a parabolic cylinder, so the extreme value is attained along a whole line. But we cannot draw any conclusion here, since higher-order terms of ##f## (we only kept terms up to second order) might still influence the graph.
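
    As a small illustration of why the degenerate case is inconclusive, here are two assumed example functions with the same second-order part ##x^2## at the origin (so ##f_{xx} = 2##, ##f_{yy} = 0##, ##f_{xy} = 0## and ##D = 0## for both), but with different fourth-order terms. The names `bowl` and `ridge` are just labels for this sketch.

```python
def bowl(x, y):
    # x**2 + y**4: a genuine local minimum at the origin
    return x**2 + y**4


def ridge(x, y):
    # x**2 - y**4: a saddle at the origin
    return x**2 - y**4


t = 0.1
# Moving along the y-axis, bowl rises above its value at the origin ...
print(bowl(0.0, t) > bowl(0.0, 0.0))                     # True
# ... while ridge drops along the y-axis and rises along the x-axis.
print(ridge(0.0, t) < ridge(0.0, 0.0) < ridge(t, 0.0))   # True
```

    The second derivative test sees the same numbers for both functions, so it cannot tell them apart.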

    This is not a proof, but it should make clear where the test comes from.
  10. Dec 2, 2013 #9
    Thanks R136a1. That was very helpful. Appreciate you taking the time for a detailed explanation.