Total Derivatives of fxns from R^n to R^m

brydustin · Apr 22, 2012

There is already a thread on Physics Forums that partly addresses my issue:
https://www.physicsforums.com/showthread.php?t=107516 but I'm not really satisfied with it, or with the Wikipedia article.

I am generally confused on what the derivative should be. I'm familiar with Jacobian matrices but am confused by Spivak's definition of a derivative because there is a norm in R^m for the numerator of the limit definition and a norm from R^n in the denominator, so it seems as if the derivative must return a scalar -- unless I am mistaken about the meaning of the norm in this case.
Spivak's definition:
[tex]\lim_{h \to 0} \frac{|f(a+h) - f(a) - \lambda(h)|}{|h|} = 0[/tex]

I'm not really sure if the zeros are vectors or scalars (for example the norm equal to zero?)

On a side note, I'm trying to apply this to solve the problem:
Prove that f: R -> R^2 is differentiable at a in R iff f_1 and f_2 are, and that in this case
f'(a) = [ f_1'(a), f_2'(a) ]^t

The result seems obvious but having a clear understanding of the definitions would help me prove something. Any help appreciated, thanks

slider142 · Apr 22, 2012

The derivative is the linear transformation λ in the numerator. It is not the fraction, which is indeed a scalar that approaches 0. In the definition above, you may read it as "The derivative of f at a is the unique linear transformation λ such that the limit as h approaches the 0 vector of ||f(a + h) - f(a) - λ(h)||/||h|| is 0."
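
To see the roles of λ and the fraction concretely, here is a minimal numerical sketch. The map f, the point a, and the direction of h are assumptions picked only for illustration; λ is taken to be multiplication by the matrix of partial derivatives of f at a. The printed numbers are the scalar ratios from the definition, and they should shrink as h shrinks.

```python
import math

def f(x, y):
    # Sample map from R^2 to R^2 (an assumption chosen for illustration).
    return (x * x + y, math.sin(x) * y)

def lam(a, h):
    # Candidate derivative at a: the linear map h -> J(a) h,
    # where J(a) is the matrix of partial derivatives of f at a.
    x, y = a
    h1, h2 = h
    return (2 * x * h1 + h2, math.cos(x) * y * h1 + math.sin(x) * h2)

def norm(v):
    # Euclidean norm; works for vectors in R^n and in R^m alike.
    return math.sqrt(sum(c * c for c in v))

def ratio(a, h):
    # The scalar |f(a+h) - f(a) - lam(h)| / |h| from the definition.
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    lh = lam(a, h)
    remainder = tuple(fah[i] - fa[i] - lh[i] for i in range(2))
    return norm(remainder) / norm(h)

a = (1.0, 2.0)
ratios = [ratio(a, (t, -t)) for t in (1e-1, 1e-2, 1e-3)]
for r in ratios:
    print(r)
```

The numerator is a norm in R^m and the denominator a norm in R^n, so the ratio is a scalar, while lam itself is still a map from R^2 to R^2. A check like this only samples one direction of h, so it illustrates the definition rather than proving differentiability.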

brydustin · Apr 22, 2012

That makes a lot more sense... so it can be a vector or a matrix depending on the values of "n" and "m"? So if n=1 and m>1 it's a row vector? And if n>1 and m=1 then it's a column vector? If n=m=1 then it's a scalar, and if n>1, m>1 then it's a matrix?

HallsofIvy · Apr 22, 2012

Strictly speaking, even in the "Calculus I" sense of a function from R¹ to R¹, the "derivative" is a linear transformation: it is "[itex]y= mx[/itex]" where m is the "usual" derivative evaluated at that point. Of course, we can represent that transformation by the coefficient df/dx = m.

For a function from R¹ to R³, the "vector valued function of a single real variable", F(t)=<f(t), g(t), h(t)>, the derivative is the linear transformation that maps x to <(df/dt)x, (dg/dt)x, (dh/dt)x>, which is just the scalar multiple x<df/dt, dg/dt, dh/dt>. We typically "represent" that linear transformation by the vector <df/dt, dg/dt, dh/dt>.
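
The same picture gives a route to the R -> R^2 problem in the original question. As a sketch, write a candidate derivative as λ(h) = (λ_1 h, λ_2 h) for real numbers λ_1, λ_2 (notation introduced here, not Spivak's). Then for i = 1, 2,
[tex]|f_i(a+h) - f_i(a) - \lambda_i h| \le \|f(a+h) - f(a) - \lambda(h)\| \le \sum_{j=1}^{2} |f_j(a+h) - f_j(a) - \lambda_j h|[/tex]
Dividing through by |h| and letting h go to 0: if the middle term tends to 0 then so does each term on the left, and if every term on the right tends to 0 then so does the middle. So f is differentiable at a exactly when f_1 and f_2 are, with λ_i = f_i'(a).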

For a function, f(x,y,z), from R³ to R¹, the "scalar valued function of three variables", the derivative is the linear transformation that maps <a, b, c> to [itex](\partial f/\partial x)a+ (\partial f/\partial y)b+ (\partial f/\partial z)c[/itex], the dot product [itex]<\partial f/\partial x, \partial f/\partial y, \partial f/\partial z>\cdot<a, b, c>[/itex] which we can represent by [itex]\nabla f= <\partial f/\partial x, \partial f/\partial y, \partial f/\partial z>[/itex].

Finally, for a function of three variables which returns a 3-vector, [itex]<f(x, y, z), g(x, y, z), h(x, y, z)>[/itex], the derivative is the linear transformation that can be represented by the 3 by 3 matrix
[tex]\begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \\ \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} & \frac{\partial h}{\partial z}\end{bmatrix}[/tex]
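
The cases above can be compared side by side with a small finite-difference sketch. It assumes the usual convention that the representing matrix has m rows and n columns (one row per output component, one column per input variable); the helper names and the sample functions are made up for illustration. Under that convention a map from R¹ to R^m gives a single column and a map from R^n to R¹ gives a single row.

```python
def jacobian(F, a, eps=1e-6):
    # Central-difference approximation of the m x n matrix of partials of F at a.
    # F takes a list of n floats and returns a list of m floats.
    n = len(a)
    m = len(F(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(a)
        up[j] += eps
        dn = list(a)
        dn[j] -= eps
        Fu, Fd = F(up), F(dn)
        for i in range(m):
            J[i][j] = (Fu[i] - Fd[i]) / (2 * eps)
    return J

def shape(M):
    # (number of rows, number of columns) of a list-of-lists matrix.
    return (len(M), len(M[0]))

def curve(v):
    # R^1 -> R^3
    return [v[0], v[0] ** 2, v[0] ** 3]

def scalar(v):
    # R^3 -> R^1
    return [v[0] * v[1] + v[2]]

def field(v):
    # R^3 -> R^3
    return [v[0] * v[1], v[1] * v[2], v[2] * v[0]]

print(shape(jacobian(curve, [2.0])))             # (3, 1)
print(shape(jacobian(scalar, [1.0, 2.0, 3.0])))  # (1, 3)
print(shape(jacobian(field, [1.0, 2.0, 3.0])))   # (3, 3)
```

The finite differences only approximate the partial derivatives, so this is a way to check shapes and rough values, not a substitute for the definition.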

slider142 · Apr 22, 2012

It is a linear transformation. If you choose a basis for R^n and R^m, then you can determine the image of the basis of R^n under the transformation λ. Due to linearity, if we write e_i as the ith basis vector of R^n in some coordinate system and x^i as the ith component of a vector x in R^n, we can then write
[tex]\lambda(x) = \lambda(x^1e_1 + \cdots + x^ne_n) = x^1\lambda(e_1) + \cdots + x^n\lambda(e_n)[/tex]
The latter sum can be written as a matrix multiplication where λ(e_i) is the ith column of the matrix. However, this is relative to a coordinate system for R^n and R^m. Change those coordinate systems, and the matrix representation of λ will be different. Hence, it is usually preferable to work with λ as a linear transformation when doing theoretical work, and to pass to a matrix representation only once a coordinate system has been fixed.
Halls gives some examples of the differing representations of λ in Cartesian coordinate systems with the standard basis above.
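
Here is a minimal sketch of the basis dependence. To keep it short it assumes n = m = 2 and uses the same basis for the domain and the codomain, which is a simplification of the general situation; the linear map and both bases are hypothetical examples.

```python
def lam(v):
    # A sample linear map R^2 -> R^2 (hypothetical, for illustration only).
    x, y = v
    return (2 * x + y, 3 * y)

def matrix_in_basis(T, basis):
    # Matrix of T when the same basis (b1, b2) is used for domain and codomain.
    # Column i holds the coordinates of T(b_i) relative to that basis.
    (p, q), (r, s) = basis  # b1 = (p, q), b2 = (r, s)
    det = p * s - q * r

    def coords(w):
        # Solve c1*b1 + c2*b2 = w by Cramer's rule.
        return ((w[0] * s - w[1] * r) / det, (p * w[1] - q * w[0]) / det)

    cols = [coords(T(b)) for b in basis]
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(2)] for i in range(2)]

standard = ((1, 0), (0, 1))
other = ((1, 0), (1, 1))

print(matrix_in_basis(lam, standard))  # [[2.0, 1.0], [0.0, 3.0]]
print(matrix_in_basis(lam, other))     # [[2.0, 0.0], [0.0, 3.0]]
```

The map lam is the same in both calls; only its matrix changes with the basis.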
