Minimization of objective function

  1. Nov 21, 2015 #1
    I need to minimize, with respect to [itex]\hat{y}(x)[/itex], the following function:
    [tex]\tilde{J}_x = \mathbb{E}_{p(x,y)}[(\hat{y}(x)-y)^2] + \nu \mathbb{E}_{p(x,y)}[(\hat{y}(x)-y)\operatorname{tr}(\nabla_x^2\hat{y}(x))] + \nu \mathbb{E}_{p(x,y)}[\|\nabla_x\hat{y}(x)\|^2],[/tex]
    where [itex]x[/itex] is a vector and [itex]y[/itex] a scalar.
    I found this in a book about Deep Learning (Machine Learning). I'm studying on my own and this math is a bit over my head. If you want more context, see pages 215-216 here: http://goodfeli.github.io/dlbook/contents/regularization.html
    First of all, do I need to learn the Calculus of Variations to solve this?
    The expression I wrote here is slightly different from the one in the book, because I think the authors forgot a "trace" (tr).
    Thank you for your time.
  3. Nov 21, 2015 #2


    I think you're looking for the Lagrange multiplier


    EDIT: Sorry, I misread the post; I thought you already wanted a solution within calculus. As for whether you'll have to learn it: not really, since you will have a machine do the minimization for you anyway. You don't have to understand why the solution works, as long as you manage to code it once or get somebody else to do it.
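
    To give a feel for what the minimization looks like, here is a sketch of the simplest special case only, assuming [itex]\nu = 0[/itex] (my assumption for illustration, not the full problem above). Then only the squared-error term remains, and it can be minimized separately for each [itex]x[/itex] with ordinary calculus:
    [tex]\mathbb{E}_{p(x,y)}[(\hat{y}(x)-y)^2] = \mathbb{E}_{p(x)}\big[\mathbb{E}_{p(y|x)}[(\hat{y}(x)-y)^2]\big],[/tex]
    [tex]\frac{\partial}{\partial \hat{y}(x)}\mathbb{E}_{p(y|x)}[(\hat{y}(x)-y)^2] = 2\big(\hat{y}(x)-\mathbb{E}_{p(y|x)}[y]\big) = 0 \;\Rightarrow\; \hat{y}(x) = \mathbb{E}_{p(y|x)}[y].[/tex]
    The terms containing [itex]\nabla_x\hat{y}(x)[/itex] couple the values of [itex]\hat{y}[/itex] at neighbouring points, so this pointwise argument does not carry over to them directly.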
  4. Nov 21, 2015 #3


    And please don't forget that the function doesn't necessarily attain a min/max value, unless it is continuous and you're working on a closed and bounded set.

    For example, the function f(x) = x attains no minimum or maximum for x∈(0,1), although you can get arbitrarily close to both the infimum and the supremum (0 and 1). You may have to consider these cases separately; that really depends on what you're doing.
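
    The open-interval example can be checked numerically. This is only a minimal sketch: the sample points 10^-1, ..., 10^-7 and the variable names are my own choices for illustration.

```python
def f(x):
    """The example function f(x) = x."""
    return x

# Sample points inside the open interval (0, 1) that approach the left endpoint.
points = [10.0 ** (-k) for k in range(1, 8)]
values = [f(x) for x in points]

# Every sample lies strictly inside (0, 1).
inside_interval = all(0.0 < x < 1.0 for x in points)

# Each value is strictly smaller than the previous one ...
strictly_decreasing = all(b < a for a, b in zip(values, values[1:]))

# ... yet none of them equals the infimum 0.
reaches_infimum = any(v == 0.0 for v in values)

print(inside_interval, strictly_decreasing, reaches_infimum)
```

    However small a value you find, a point closer to 0 gives a smaller one, so no point of (0,1) is a minimizer. A finite sample can only illustrate this; it is not a proof.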

