# Total Derivatives of fxns from R^n to R^m

1. Apr 22, 2012

### brydustin

There is already a thread on Physics Forums that kind of addresses my issue:
https://www.physicsforums.com/showthread.php?t=107516 and I'm not really satisfied with the Wikipedia article.

I am generally confused on what the derivative should be. I'm familiar with Jacobian matrices but am confused by Spivak's definition of a derivative because there is a norm in R^m for the numerator of the limit definition and a norm from R^n in the denominator, so it seems as if the derivative must return a scalar -- unless I am mistaken about the meaning of the norm in this case.
Spivak's definition:
$$\lim_{h \to 0} \frac{|f(a+h) - f(a) - \lambda(h)|}{|h|} = 0$$

I'm not really sure if the zeros are vectors or scalars (for example the norm equal to zero?)

On a side note, I'm trying to apply this to solve the problem:
Prove that f:R->R^2 is differentiable at a in R iff f_1 and f_2 are, and then it will follow that
f ' (a) = [ f_1 ' (a) , f_2 ' (a) ]^t

The result seems obvious but having a clear understanding of the definitions would help me prove something. Any help appreciated, thanks

2. Apr 22, 2012

### slider142

The derivative is the linear transformation λ in the numerator. It is not the fraction, which is indeed a scalar that approaches 0. In the definition above, you may read it as "The derivative of f at a is the unique linear transformation λ such that the limit as h approaches the 0 vector of ||f(a + h) - f(a) - λ(h)||/||h|| is 0."
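
As a rough sketch of how this reading of the definition handles the f:R->R^2 problem from the first post (the remainder names $r$, $r_1$, $r_2$ are introduced here only for illustration and are not Spivak's notation): every linear map from R to R^2 has the form $\lambda(h) = (\lambda_1 h, \lambda_2 h)$, so write

$$r(h) = f(a+h) - f(a) - \lambda(h) = (r_1(h), r_2(h)), \qquad r_i(h) = f_i(a+h) - f_i(a) - \lambda_i h.$$

For $i = 1, 2$ the Euclidean norm satisfies

$$|r_i(h)| \le \|r(h)\| \le |r_1(h)| + |r_2(h)|.$$

Dividing through by $|h|$ and letting $h \to 0$: the left inequality shows that if $\|r(h)\|/|h| \to 0$ then each $|r_i(h)|/|h| \to 0$, and the right inequality gives the converse. So f is differentiable at a exactly when f_1 and f_2 are, with $\lambda_i = f_i'(a)$.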

3. Apr 22, 2012

### brydustin

That makes a lot more sense... so it can be a vector or a matrix depending on the values of "n" and "m"? So if n=1 and m>1 it's a row vector? And if n>1 and m=1 then it's a column vector? If n=m=1 then it's a scalar, and if n>1, m>1 then it's a matrix?

4. Apr 22, 2012

### HallsofIvy

Strictly speaking, even in the "Calculus I" sense of a function from R^1 to R^1, the "derivative" is a linear transformation: it is "$y= mx$", where m is the "usual" derivative evaluated at that point. Of course, we can represent that transformation by the coefficient df/dx= m.

For a function from R^1 to R^3, the "vector valued function of a single real variable", F(t)=<f(t), g(t), h(t)>, the derivative is the linear transformation that maps x to <(df/dt)x, (dg/dt)x, (dh/dt)x>, which is just the scalar multiple x<df/dt, dg/dt, dh/dt>. We typically "represent" that linear transformation as the vector <df/dt, dg/dt, dh/dt>.

For a function, f(x,y,z), from R^3 to R^1, the "scalar valued function of three variables", the derivative is the linear transformation that maps <a, b, c> to $(\partial f/\partial x)a+ (\partial f/\partial y)b+ (\partial f/\partial z)c$, the dot product $\langle\partial f/\partial x, \partial f/\partial y, \partial f/\partial z\rangle\cdot\langle a, b, c\rangle$, which we can represent by $\nabla f= \langle\partial f/\partial x, \partial f/\partial y, \partial f/\partial z\rangle$.

Finally, for a function of three variables which returns a 3-vector, $\langle f(x, y, z), g(x, y, z), h(x, y, z)\rangle$, the derivative is the linear transformation that can be represented by the 3 by 3 matrix
$$\begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \\ \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} & \frac{\partial h}{\partial z}\end{bmatrix}$$
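
To make the link with Spivak's limit concrete, here is a minimal numerical sketch in plain Python. The function `F`, the point `a`, and the direction are arbitrary choices made up for illustration, not anything from Spivak. It checks that the ratio ||F(a+h) - F(a) - J h||/||h|| shrinks as h shrinks, where J is the matrix of partial derivatives at a.

```python
import math

def F(p):
    # Hypothetical example map from R^3 to R^3, chosen only for illustration.
    x, y, z = p
    return [x * y, y * z, x + z * z]

def jacobian(p):
    # Matrix of partial derivatives of F at p, computed by hand.
    x, y, z = p
    return [[y,   x,   0.0],
            [0.0, z,   y],
            [1.0, 0.0, 2.0 * z]]

def matvec(M, v):
    # Apply the matrix M to the vector v.
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def norm(v):
    # Euclidean norm.
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(a, h):
    # The scalar ||F(a+h) - F(a) - J(a) h|| / ||h|| from the limit definition.
    a_plus_h = [ai + hi for ai, hi in zip(a, h)]
    linear_part = matvec(jacobian(a), h)
    r = [fp - f0 - lp for fp, f0, lp in zip(F(a_plus_h), F(a), linear_part)]
    return norm(r) / norm(h)

a = [1.0, 2.0, 3.0]
direction = [1.0, -1.0, 0.5]

# Shrink h along one fixed direction and record the ratio each time.
ratios = [remainder_ratio(a, [t * d for d in direction])
          for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this polynomial example the remainder is purely quadratic in h, so the ratio should fall by roughly a factor of 10 each time h does. Testing a single direction only illustrates the definition; the limit itself is over all h approaching 0.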

Last edited: Apr 22, 2012

5. Apr 22, 2012

### slider142

It is a linear transformation. If you choose a basis for R^n and R^m, then you can determine the image of the basis of R^n under the transformation λ. Due to linearity, if we write e_i as the ith basis vector of R^n in some coordinate system and x^i as the ith component of a vector x in R^n, we can then write
$$\lambda(x) = \lambda(x^1e_1 + \cdots + x^ne_n) = x^1\lambda(e_1) + \cdots + x^n\lambda(e_n)$$
The latter sum can be written as a matrix multiplication where λ(e_i) is the ith column of the matrix. However, this is relative to a coordinate system for R^n and R^m. Change those coordinate systems, and the matrix representation of λ will be different. Hence, it is usually preferable to work with λ as a linear transformation when doing theoretical work, and to pass to a matrix representation only once a coordinate system has been fixed.
Halls gives some examples of the differing representations of λ in Cartesian coordinate systems with the standard basis above.
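
To illustrate the basis dependence, here is a small sketch in plain Python. The matrices `A` and `P` are made-up examples, and using the same new basis for both domain and codomain is an assumption chosen to keep it short. The same linear map gets two different matrices, yet both send a given vector to the same image.

```python
def matmul(A, B):
    # Product of two matrices stored as lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(M, v):
    # Apply the matrix M to the vector v.
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# Matrix of a linear map on R^2 in the standard basis (illustrative choice).
A = [[2.0, 1.0],
     [0.0, 3.0]]

# Columns of P are the new basis vectors b1 = (1, 0) and b2 = (1, 1).
P = [[1.0, 1.0],
     [0.0, 1.0]]
P_inv = [[1.0, -1.0],
         [0.0,  1.0]]

# Matrix of the same map when both domain and codomain use the new basis.
B = matmul(P_inv, matmul(A, P))

x_new = [2.0, 5.0]          # coordinates of a vector in the new basis
x_std = matvec(P, x_new)    # the same vector in standard coordinates

y_std = matvec(A, x_std)    # image computed in standard coordinates
y_new = matvec(B, x_new)    # image computed in new-basis coordinates

print(B)
print(y_std, matvec(P, y_new))
```

Here `B` comes out diagonal because b1 and b2 happen to be eigenvectors of the map, while `A` is not diagonal; converting `y_new` back with `P` recovers `y_std`, so only the representation changed.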