
Total Derivatives of fxns from R^n to R^m

  1. Apr 22, 2012 #1
    There is already an article on physics forums that kind of addresses my issue:
    https://www.physicsforums.com/showthread.php?t=107516 and i'm not really satisfied with the wikipedia article.

    I am generally confused about what the derivative should be. I'm familiar with Jacobian matrices, but Spivak's definition of the derivative confuses me: there is a norm on R^m in the numerator of the limit and a norm on R^n in the denominator, so it seems as if the derivative must be a scalar -- unless I am mistaken about the meaning of the norm here.
    Spivak's definition:
    [tex]\lim_{h \to 0} \frac{|f(a+h) - f(a) - \lambda(h)|}{|h|} = 0[/tex]

    I'm not really sure whether the zeros here are vectors or scalars (for example, is it the norm that equals zero?).

    On a side note, I'm trying to apply this to solve the problem:
    Prove that f: R -> R^2 is differentiable at a in R iff f_1 and f_2 are, in which case
    [tex]f'(a) = \begin{bmatrix} f_1'(a) \\ f_2'(a) \end{bmatrix}[/tex]

    The result seems obvious, but a clear understanding of the definitions would help me write a proof. Any help is appreciated, thanks.
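    As a numerical sanity check of the claim (not a proof), one can compare a difference quotient of a sample curve against the vector of component derivatives at a point; the example functions f(t) = (t^2, sin t) below are an assumed illustration, not from the thread.

    ```python
    import math

    def f(t):
        # sample curve f: R -> R^2 (illustrative choice, not from the problem)
        return (t * t, math.sin(t))

    def f_prime(t):
        # componentwise derivatives (f_1'(t), f_2'(t))
        return (2 * t, math.cos(t))

    a, h = 0.7, 1e-6
    # difference quotient of the vector-valued function, computed componentwise
    quotient = tuple((fa_h - fa) / h for fa_h, fa in zip(f(a + h), f(a)))
    print(quotient)      # close to f_prime(a)
    print(f_prime(a))
    ```

    The point of the check is that differentiating the curve and differentiating each component give the same answer, which is what the iff statement formalizes.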
  3. Apr 22, 2012 #2
    The derivative is the linear transformation λ in the numerator. It is not the fraction, which is indeed a scalar that approaches 0. In the definition above, you may read it as "The derivative of f at a is the unique linear transformation λ such that the limit as h approaches the 0 vector of ||f(a + h) - f(a) - λ(h)||/||h|| is 0."
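    To illustrate this reading of the definition (an assumed example, not part of the reply), one can check numerically that plugging the Jacobian in for λ makes the quotient shrink as h shrinks; here the hypothetical map is f(x, y) = (x², xy) from R² to R², with a = (1, 2).

    ```python
    import math

    def f(x, y):
        # sample map f: R^2 -> R^2 (chosen for illustration)
        return (x * x, x * y)

    def jacobian_at(x, y):
        # analytic Jacobian of f at (x, y); rows are gradients of the components
        return ((2 * x, 0.0), (y, x))

    def ratio(a, h):
        # ||f(a+h) - f(a) - lambda(h)|| / ||h||, the quantity in Spivak's definition,
        # with lambda taken to be multiplication by the Jacobian at a
        (ax, ay), (hx, hy) = a, h
        fx, fy = f(ax + hx, ay + hy)
        gx, gy = f(ax, ay)
        J = jacobian_at(ax, ay)
        lx = J[0][0] * hx + J[0][1] * hy
        ly = J[1][0] * hx + J[1][1] * hy
        return math.hypot(fx - gx - lx, fy - gy - ly) / math.hypot(hx, hy)

    a = (1.0, 2.0)
    for t in (1e-1, 1e-3, 1e-5):
        print(ratio(a, (t, t)))   # shrinks toward 0 as h -> 0
    ```

    The printed ratios are scalars tending to 0, while the derivative itself is the linear map h ↦ Jh.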
  4. Apr 22, 2012 #3
    That makes a lot more sense... so its matrix can be a vector or a full matrix depending on the values of n and m? If n = 1 and m > 1, it's a column vector (m by 1)? If n > 1 and m = 1, it's a row vector (1 by n)? If n = m = 1, it's a scalar, and if n > 1 and m > 1, it's an m by n matrix?
  5. Apr 22, 2012 #4



    Strictly speaking, even in the "Calculus I" sense of a function from R^1 to R^1, the "derivative" is a linear transformation: it is [itex]y= mx[/itex], where m is the "usual" derivative evaluated at that point. Of course, we can represent that transformation by the single coefficient df/dx = m.

    For a function from R^1 to R^3, the "vector valued function of a single real variable", F(t)=<f(t), g(t), h(t)>, the derivative is the linear transformation that maps x to <(df/dt)x, (dg/dt)x, (dh/dt)x>, which is just the scalar product x<df/dt, dg/dt, dh/dt>. We typically "represent" that linear transformation by the vector <df/dt, dg/dt, dh/dt>.

    For a function f(x, y, z) from R^3 to R^1, the "scalar valued function of three variables", the derivative is the linear transformation that maps <a, b, c> to [itex](\partial f/\partial x)a+ (\partial f/\partial y)b+ (\partial f/\partial z)c[/itex], i.e. the dot product [itex]<\partial f/\partial x, \partial f/\partial y, \partial f/\partial z>\cdot<a, b, c>[/itex], which we can represent by [itex]\nabla f= <\partial f/\partial x, \partial f/\partial y, \partial f/\partial z>[/itex].

    Finally, for a function of three variables which returns a 3-vector, [itex]F(x, y, z)= <f(x, y, z), g(x, y, z), h(x, y, z)>[/itex], the derivative is the linear transformation that can be represented by the 3 by 3 matrix
    [tex]\begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \\ \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} & \frac{\partial h}{\partial z}\end{bmatrix}[/tex]
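    Each column of that matrix can also be approximated numerically: the ith column is roughly (F(a + t e_i) - F(a))/t for small t. A sketch, assuming the example map F(x, y, z) = (xy, yz, x + z), which is not from the thread:

    ```python
    def F(x, y, z):
        # example map R^3 -> R^3 (illustrative choice)
        return (x * y, y * z, x + z)

    def numeric_jacobian(F, a, t=1e-6):
        # forward-difference approximation of the 3x3 Jacobian of F at a
        Fa = F(*a)
        cols = []
        for i in range(3):
            ah = list(a)
            ah[i] += t          # step along the ith basis direction
            Fh = F(*ah)
            cols.append([(Fh[r] - Fa[r]) / t for r in range(3)])
        # transpose so that rows are gradients of each component, as in the matrix above
        return [[cols[c][r] for c in range(3)] for r in range(3)]

    a = (1.0, 2.0, 3.0)
    J = numeric_jacobian(F, a)
    # analytic Jacobian at (1, 2, 3) is [[y, x, 0], [0, z, y], [1, 0, 1]] = [[2,1,0],[0,3,2],[1,0,1]]
    ```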
  6. Apr 22, 2012 #5
    It is a linear transformation. If you choose bases for R^n and R^m, then you can determine the images of the basis vectors of R^n under the transformation λ. Due to linearity, if we write e_i for the ith basis vector of R^n in some coordinate system and x^i for the ith component of a vector x in R^n, we can then write
    [tex]\lambda(x) = \lambda(x^1e_1 + \cdots + x^ne_n) = x^1\lambda(e_1) + \cdots + x^n\lambda(e_n)[/tex]
    The latter sum can be written as a matrix multiplication in which λ(e_i) is the ith column of the matrix. However, this is relative to the chosen coordinate systems for R^n and R^m: change those coordinate systems, and the matrix representation of λ will be different. Hence, for theoretical work it is usually preferable to treat λ as a linear transformation, reserving a matrix representation for computations in a given coordinate system.
    Halls gives some examples of the differing representations of λ in Cartesian coordinate systems with the standard basis above.
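    The column-by-column construction described above can be sketched concretely: given a linear map available only as a function, its matrix in the standard bases has λ(e_i) as its ith column. The particular map below is an assumed example for illustration.

    ```python
    def lam(x):
        # an assumed linear map R^3 -> R^2, given only as a function of x
        return (2 * x[0] - x[2], x[1] + 3 * x[2])

    n, m = 3, 2
    # standard basis vectors e_1, ..., e_n of R^n
    basis = [tuple(1.0 if j == i else 0.0 for j in range(n)) for i in range(n)]
    columns = [lam(e) for e in basis]          # lam(e_i) is the ith column
    matrix = [[columns[c][r] for c in range(n)] for r in range(m)]

    # matrix multiplication now reproduces lam on any vector
    x = (1.0, 2.0, 3.0)
    mx = tuple(sum(matrix[r][c] * x[c] for c in range(n)) for r in range(m))
    ```

    Changing the bases would change `matrix` but not the map `lam` itself, which is the coordinate-independence point made above.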