Definition of derivative

  1. Aug 13, 2005 #1
    Saw this in the thread about "Explaining topology..."

    Is there an easy way to extend this particular definition to when f : R^n -> R^m?
     
    Last edited: Aug 13, 2005
  3. Aug 13, 2005 #2

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    A function whose range is R^m is simply m functions whose range is R.
     
  4. Aug 13, 2005 #3
    Yes, that makes sense. What I'm getting at is the term proportional to h at the end. If h is a vector in R^n and f(h) is a vector in R^m, how do you make that particular definition work?
     
  5. Aug 13, 2005 #4

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    Bleh, I wasn't paying any attention at all. I'm embarrassed to have written my previous post!

    That definition already works for f : R^n → R^m.


    Though, I guess it's much less complicated to write R(h) instead of h o(|h|), and require that |R(h)| / |h| → 0. (R for remainder)
     
    Last edited: Aug 13, 2005
  6. Aug 13, 2005 #5
    I could live with R(h) as the remainder if it is a vector in R^m. But it seems that h o(|h|) is a vector in R^n, and you can't add that to two vectors in R^m, can you?
     
  7. Aug 13, 2005 #6

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    You're assuming the thing in the o(|h|) is a number: it could be, for example, an operator R^n → R^m whose operator norm is asymptotically less than |h|.
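    The remainder condition discussed above is easy to probe numerically. Here is a minimal Python sketch (the map f and the point a are my own illustrative choices, not from the thread): with L the Jacobian of f at a, the remainder R(h) = f(a+h) - f(a) - Lh should satisfy |R(h)| / |h| → 0 as h → 0.

    ```python
    # Sketch (illustrative example): for f(x, y) = (x^2, x*y) at a = (1, 2),
    # the derivative is the Jacobian L = [[2, 0], [2, 1]], and the remainder
    # R(h) = f(a + h) - f(a) - L h should satisfy |R(h)| / |h| -> 0.
    import math

    def f(x, y):
        return (x * x, x * y)

    def L(h1, h2):
        # Jacobian of f at (1, 2), applied to h
        return (2 * h1, 2 * h1 + 1 * h2)

    def remainder_ratio(t):
        h = (t, t)  # shrink h along the diagonal
        fx = f(1 + h[0], 2 + h[1])
        fa = f(1, 2)
        Lh = L(h[0], h[1])
        R = (fx[0] - fa[0] - Lh[0], fx[1] - fa[1] - Lh[1])
        return math.hypot(*R) / math.hypot(*h)

    ratios = [remainder_ratio(10.0 ** -k) for k in range(1, 6)]
    print(ratios)  # the ratios shrink toward 0 as h does
    ```

    For this particular f the remainder works out to (t^2, t^2), so the ratio shrinks like |h| itself.
    
    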
     
  8. Aug 13, 2005 #7
    You're right, I was assuming that. I know it doesn't really matter what the order is, but it might be more suggestive to write it o(|h|)h or something like that, to make it more clear that o(|h|) is a linear map. Thanks for the help :smile:
     
  9. Aug 13, 2005 #8

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    It can't be a linear map. (Unless it's the zero operator)
     
  10. Aug 13, 2005 #9
    Ah, okay. Can you explain?

    edit: I guess it's because, if it were linear, you could always add more vectors to the sum so that

    o(|h|)(Sum of vectors) > |h| as h->0?
     
    Last edited: Aug 13, 2005
  11. Aug 13, 2005 #10

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    Well, if o(|h|) was hiding a linear map, then the remainder would look like:

    R(h) = A h

    for some matrix A...
     
  12. Aug 13, 2005 #11
    Right. I guess what I had in mind was a linear map parametrized by |h|, if that makes any sense at all. But that wouldn't be a linear map since it operates differently on different choices of h. Thanks.
     
  13. Aug 14, 2005 #12

    HallsofIvy

    Staff Emeritus
    Science Advisor

    Of course, before you can define "derivative" of a function from Rn to Rm, you have to define "differentiable" (that's different from calculus I where a function is "differentiable" as long as the derivative exists!).

    If f(x) is a function from Rn to Rm, then f is differentiable at x= a if and only if there exists a linear function, L, from Rn to Rm, and a function ε(x), from Rn to Rm, such that

    f(x)= f(a)+ L(x-a)+ ε(x) and [tex]\lim_{x \to a}\frac{\epsilon(x)}{|x-a|}= 0[/tex].

    If that is true, then it is easy to show that the linear function, L, is unique (ε is not). We define the "derivative of f at a" to be that linear function, L.

    Notice that, by this definition, in the case f:R1->R1, the derivative of f at a is a linear function from R1->R1, not a number! However, any such linear function must be of the form L(x)= ax, that is, multiplication by a number. That number is, of course, the "Calculus I" derivative of f.

    Similarly, the derivative of a "vector valued function of a real variable", R1->Rm, is a linear function from R1 to Rm. Any such function can be written L(x)= x<a1, ..., am>, or x times a vector. That vector is the vector of derivatives in the usual "calculus III" sense.

    The derivative of a "real valued function of several real variables", Rn->R1, is a linear function from Rn to R1. Such a function can be written as a dot product: L(x)= <a1, ..., an>·x. The vector <a1, ..., an> is precisely the "gradient vector" of f. (And recall that, in Calculus III, a function may have partial derivatives at a point but not be "differentiable" there.)

    This is, by the way, where the "second derivative test" for max or min (or saddle point) of a function of two variables comes from: You look at [itex]\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2}- \left(\frac{\partial^2F}{\partial x \partial y}\right)^2[/itex]. If that is negative at a (where the partial derivatives are 0), then there is a saddle point at a. If that is positive, then you have either a max or min depending on the sign of the second partials (which must be the same).

    The point is that, if F:R2->R, then its derivative, at each point, can be represented as a 2-vector (the gradient vector). That means that the derivative function, which assigns to each point the gradient vector at that point, is a function from R2 to R2- and its derivative is a linear transformation from R2 to R2, which can be represented by a 2 by 2 matrix at each point (the "Hessian" matrix). The calculation [itex]\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2}- \left(\frac{\partial^2F}{\partial x\partial y}\right)^2[/itex] is simply the determinant of that matrix. Since the mixed second derivatives are equal, that matrix is symmetric and can, by a coordinate change, be written as a diagonal matrix having the eigenvalues on the diagonal. In that coordinate system, the quadratic approximation to F is just ax2+ by2 (no xy term), so if a and b are both positive we have a minimum, if both negative a maximum, and if one is positive and the other negative, a saddle point. Of course, the determinant (which does not change with a change of coordinate system) is just the product ab.
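    The two-variable test described above can be sketched as a tiny Python helper. The sample functions below are my own, chosen for illustration; each is evaluated at its critical point at the origin.

    ```python
    # Second-derivative test via the Hessian determinant
    # D = Fxx * Fyy - Fxy**2, evaluated at a critical point.
    def classify(Fxx, Fyy, Fxy):
        D = Fxx * Fyy - Fxy ** 2
        if D < 0:
            return "saddle"
        if D > 0:
            return "min" if Fxx > 0 else "max"
        return "inconclusive"

    # F(x, y) = x**2 + y**2: Fxx = 2, Fyy = 2, Fxy = 0 -> D = 4 > 0
    print(classify(2, 2, 0))    # min
    # F(x, y) = x**2 - y**2: Fxx = 2, Fyy = -2, Fxy = 0 -> D = -4 < 0
    print(classify(2, -2, 0))   # saddle
    # F(x, y) = x*y: Fxx = 0, Fyy = 0, Fxy = 1 -> D = -1 < 0
    print(classify(0, 0, 1))    # saddle
    ```

    The last example is the classic case where both pure second partials vanish but the mixed partial still forces a saddle.
    
    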
     
    Last edited: Aug 14, 2005
  14. Aug 14, 2005 #13

    mathwonk

    Science Advisor
    Homework Helper
    2015 Award

    to define a derivative you first define what it means for a map to have derivative equal to zero. a map o(h) has derivative equal to zero at h =0 if and only if |o(h)|/|h| approaches zero as h does.

    this makes sense for vector valued maps if |h| is a norm.

    then the original definition makes sense (corrected) if we say that f is differentiable at x provided there exists a linear map L(h) such that the difference

    f(x+h) - L(h) - f(x) has derivative equal to zero at h = 0.

    then Df(x) = L.

    in the original definition there is possibly an error, since the term h.o(h) should have been merely o(h) in this sense.
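    This criterion can be illustrated numerically. In the sketch below (example maps chosen by me, not from the thread), a quadratic map passes the |o(h)|/|h| → 0 test while a nonzero linear map fails it, matching the earlier point in the thread that the remainder cannot hide a linear map.

    ```python
    # |o(h)| / |h| -> 0 holds for a quadratic map but fails for a
    # nonzero linear map, whose ratio stays constant as h -> 0.
    import math

    def ratio(m, t):
        # evaluate the map m at h = (t, t) and form |m(h)| / |h|
        v = m(t, t)
        return math.hypot(*v) / math.hypot(t, t)

    quadratic = lambda h1, h2: (h1 * h2, h1 * h1)   # ratio -> 0: is o(|h|)
    linear    = lambda h1, h2: (2 * h1, h1 + h2)    # ratio constant: not o(|h|)

    for t in (1e-1, 1e-3, 1e-5):
        print(ratio(quadratic, t), ratio(linear, t))
    ```

    Along this diagonal path the quadratic map's ratio equals t, while the linear map's ratio is exactly 2 no matter how small h gets.
    
    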
     
  15. Aug 14, 2005 #14

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    So what precisely does little-oh notation mean when applied to a vector-valued function? I've only ever really used it in the context of algorithmic analysis, and time complexity isn't vector valued. :biggrin:
     
  16. Aug 14, 2005 #15

    mathwonk

    Science Advisor
    Homework Helper
    2015 Award

    exactly what i said: |o(h)|/|h| approaches zero (a number), as |h| does, or equivalently as h does (a vector).
     
  17. Aug 14, 2005 #16

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    Well, |o(h)| / |h| can't approach a number because, technically, o(h) is a set. :tongue2:

    I guess this is what I assumed, so it's good to hear confirmation:

    [tex]
    f \in o(g) \Leftrightarrow \lim \frac{|f|}{|g|} = 0
    [/tex]
     
  18. Aug 14, 2005 #17

    mathwonk

    Science Advisor
    Homework Helper
    2015 Award

    ??????

    o(h) is a vector valued function of h, which is also a vector.

    oh i see, my leg is being pulled..
     
  19. Aug 14, 2005 #18

    mathwonk

    Science Advisor
    Homework Helper
    2015 Award

    this notation was used this way by Hardy and also by Loomis, among others.
     
  20. Aug 14, 2005 #19

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    !

    In computer science, the notation o(f) is the class of all functions that are asymptotically less than f. Specifically:

    [tex]
    o(f) := \left\{ g \, | \,
    \lim_{x \rightarrow \infty} \frac{g(x)}{f(x)} = 0
    \right\}
    [/tex]

    (P.S. how do I get a nice, tall [itex]|[/itex]?)

    I understand that here we are looking at x → 0 instead, but still...


    If I were to say something like:

    f(x) = x ln x + o(x)

    what I really mean is that there exists some function g in o(x) such that

    f(x) = x ln x + g(x)
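    That membership reading is easy to check for a concrete g. In this sketch (g chosen by me for illustration), g(x) = sqrt(x) lies in o(x) as x → ∞, since g(x)/x = 1/sqrt(x) → 0, so it would qualify as the hidden remainder above.

    ```python
    # g(x) = sqrt(x) is in o(x) as x -> infinity: g(x)/x -> 0.
    import math

    ratios = [math.sqrt(x) / x for x in (1e2, 1e4, 1e6)]
    print(ratios)  # [0.1, 0.01, 0.001]
    ```

    By contrast g(x) = 3x would not qualify, since g(x)/x = 3 does not tend to zero.
    
    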
     
    Last edited: Aug 14, 2005
  21. Aug 14, 2005 #20
    Use: \left| and \right| .................. [tex] \left| \frac{\mbox{Big}}{\mbox{Expression}} \right| [/tex]
     