Total Derivatives of fxns from R^n to R^m

  • Context: Graduate
  • Thread starter: brydustin
  • Tags: Derivatives

Discussion Overview

The discussion revolves around the concept of total derivatives for functions mapping from R^n to R^m, focusing on definitions, interpretations, and applications of derivatives in various contexts. Participants express confusion regarding the definitions and implications of derivatives, particularly in relation to Jacobian matrices and linear transformations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about Spivak's definition of a derivative, questioning the meaning of norms in the limit definition and whether the zeros involved are vectors or scalars.
  • Another participant clarifies that the derivative is a linear transformation and that the limit approaches a scalar value of zero, emphasizing the role of the linear transformation in the definition.
  • A participant inquires whether the derivative can be represented as a vector or matrix depending on the dimensions of n and m, suggesting specific cases for different values of n and m.
  • One participant discusses the representation of derivatives as linear transformations in various contexts, including functions from R^1 to R^1 and R^3, and how these can be expressed in terms of matrices.
  • Another participant elaborates on the concept of linear transformations and their representation in different coordinate systems, noting that the matrix representation of a linear transformation can vary with the choice of basis.

Areas of Agreement / Disagreement

Participants generally agree on the notion of derivatives as linear transformations, but there remains some uncertainty and differing interpretations regarding the definitions and representations of these transformations in various contexts.

Contextual Notes

Some limitations include the dependence on definitions of norms and linear transformations, as well as the potential variability in matrix representations based on coordinate systems. The discussion does not resolve these complexities.

brydustin
There is already a thread on Physics Forums that partly addresses my issue:
https://www.physicsforums.com/showthread.php?t=107516, and I'm not really satisfied with the Wikipedia article.

I am generally confused on what the derivative should be. I'm familiar with Jacobian matrices but am confused by Spivak's definition of a derivative because there is a norm in R^m for the numerator of the limit definition and a norm from R^n in the denominator, so it seems as if the derivative must return a scalar -- unless I am mistaken about the meaning of the norm in this case.
Spivak's definition:
$$\lim_{h \to 0} \frac{|f(a+h) - f(a) - \lambda(h)|}{|h|} = 0$$

I'm not really sure whether the zeros are vectors or scalars (for example, is it the norm that equals zero?)

On a side note, I'm trying to apply this to solve the problem:
Prove that f: R -> R^2 is differentiable at a in R iff f_1 and f_2 are, and that in this case
f'(a) = [ f_1'(a), f_2'(a) ]^t

The result seems obvious, but having a clear understanding of the definitions would help me prove it. Any help appreciated, thanks.
 
The derivative is the linear transformation λ in the numerator. It is not the fraction, which is indeed a scalar that approaches 0. In the definition above, you may read it as "The derivative of f at a is the unique linear transformation λ such that the limit as h approaches the 0 vector of ||f(a + h) - f(a) - λ(h)||/||h|| is 0."
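To make the reading above concrete, here is a minimal numerical sketch. The sample map f: R^2 -> R^2, the point a, and the direction of h are all assumptions chosen for illustration, not taken from the thread. It takes λ to be multiplication by the hand-computed Jacobian at a and prints the scalar ratio |f(a+h) - f(a) - λ(h)|/|h| for shrinking h, which should shrink toward 0.

```python
import math

def f(x, y):
    # Sample map R^2 -> R^2 (an illustrative assumption, not from the thread).
    return (x * x * y, math.sin(x) + y)

def jacobian_at(x, y):
    # Hand-computed partial derivatives of the sample f at (x, y).
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def norm(v):
    # Euclidean norm of a tuple of floats.
    return math.sqrt(sum(c * c for c in v))

def ratio(a, h):
    """Return |f(a+h) - f(a) - lambda(h)| / |h|, with lambda(h) = J(a) h."""
    J = jacobian_at(*a)
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    lam_h = tuple(J[i][0] * h[0] + J[i][1] * h[1] for i in range(2))
    remainder = tuple(fah[i] - fa[i] - lam_h[i] for i in range(2))
    return norm(remainder) / norm(h)

a = (1.0, 2.0)
direction = (0.6, -0.8)          # unit vector, so |h| equals the step size t
steps = (1e-1, 1e-2, 1e-3)
ratios = [ratio(a, tuple(t * d for d in direction)) for t in steps]

for t, r in zip(steps, ratios):
    print(f"|h| = {t:g}: ratio = {r:.3e}")
```

Note that the quantity printed is a scalar, while λ itself (here, the 2 by 2 Jacobian acting on h) is the derivative.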
 
That makes a lot more sense... so it can be a vector or a matrix depending on the values of "n" and "m"? So if n=1 and m>1 it's a row vector? And if n>1 and m=1 then it's a column vector? If n=m=1 then it's a scalar, and if n>1, m>1 then it's a matrix?
 
Strictly speaking, even in the "Calculus I" sense of a function from R^1 to R^1, the "derivative" is a linear transformation: it is "y = mx", where m is the "usual" derivative evaluated at that point. Of course, we can represent that transformation by the coefficient df/dx = m.

For a function from R^1 to R^3, the "vector valued function of a single real variable", F(t) = <f(t), g(t), h(t)>, the derivative is the linear transformation that maps x to <(df/dt)x, (dg/dt)x, (dh/dt)x>, which is just the scalar multiple x<df/dt, dg/dt, dh/dt>. We typically "represent" that linear transformation by the vector <df/dt, dg/dt, dh/dt>.
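The same idea gives one possible route to the side problem in the opening post (f: R -> R^2 is differentiable at a iff f_1 and f_2 are). This is a sketch, not necessarily the argument Spivak intends. It assumes the Euclidean norm on R^2 and uses the fact that every linear map λ: R -> R^2 has the form λ(h) = (c_1 h, c_2 h) for some constants c_1, c_2 (symbols introduced here for illustration).

```latex
% Assumption: \lambda(h) = (c_1 h, c_2 h), the general linear map R -> R^2.
\[
|f_i(a+h) - f_i(a) - c_i h|
\;\le\;
\|f(a+h) - f(a) - \lambda(h)\|
\;\le\;
\sum_{j=1}^{2} |f_j(a+h) - f_j(a) - c_j h|,
\qquad i = 1, 2.
\]
```

Dividing through by |h| and letting h -> 0: if the middle ratio tends to 0, the left inequality forces each component ratio to tend to 0, so each f_i is differentiable at a with f_i'(a) = c_i. Conversely, if both component ratios tend to 0 with c_i = f_i'(a), the right inequality forces the middle ratio to 0. Either way λ is represented by the column [f_1'(a), f_2'(a)]^t.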

For a function, f(x,y,z), from R^3 to R^1, the "scalar valued function of three variables", the derivative is the linear transformation that maps $\langle a, b, c\rangle$ to $(\partial f/\partial x)a + (\partial f/\partial y)b + (\partial f/\partial z)c$, the dot product $\langle \partial f/\partial x, \partial f/\partial y, \partial f/\partial z\rangle \cdot \langle a, b, c\rangle$, which we can represent by $\nabla f = \langle \partial f/\partial x, \partial f/\partial y, \partial f/\partial z\rangle$.

Finally, for a function of three variables that returns a 3-vector, $\langle f(x, y, z), g(x, y, z), h(x, y, z)\rangle$, the derivative is the linear transformation that can be represented by the 3 by 3 matrix
$$\begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \\ \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} & \frac{\partial h}{\partial z}\end{bmatrix}$$
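The cases above can be checked numerically with a rough sketch like the following. It approximates the matrix of partial derivatives by central differences; the helper names `jacobian_fd` and `shape` and the three sample functions are hypothetical, made up for illustration. Under the usual convention that the matrix acts on column vectors, a map from R^n to R^m gets an m by n matrix, so n=1, m>1 gives a single column and n>1, m=1 gives a single row (the reverse of the guess in the question above).

```python
def jacobian_fd(F, a, eps=1e-6):
    """Central-difference approximation to the m x n matrix of partials of F at a.

    F takes a list of n floats and returns a list of m floats.
    """
    n = len(a)
    m = len(F(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(a)
        dn = list(a)
        up[j] += eps
        dn[j] -= eps
        Fu, Fd = F(up), F(dn)
        for i in range(m):
            # Entry (i, j) approximates d F_i / d x_j.
            J[i][j] = (Fu[i] - Fd[i]) / (2 * eps)
    return J

def shape(J):
    # (number of rows, number of columns)
    return (len(J), len(J[0]))

# Hypothetical sample functions, one per case discussed above.
curve = lambda v: [v[0], v[0] ** 2, v[0] ** 3]                      # R^1 -> R^3
scalar_field = lambda v: [v[0] * v[1] + v[2] ** 2]                  # R^3 -> R^1
vector_field = lambda v: [v[0] * v[1], v[1] * v[2], v[0] + v[2]]    # R^3 -> R^3

J_curve = jacobian_fd(curve, [2.0])
J_scalar = jacobian_fd(scalar_field, [1.0, 2.0, 3.0])
J_vector = jacobian_fd(vector_field, [1.0, 2.0, 3.0])

print(shape(J_curve), shape(J_scalar), shape(J_vector))
```

Finite differences only approximate the partial derivatives, and the existence of all partials does not by itself guarantee differentiability in the sense of the limit definition, so this is an illustration of the shapes rather than a test of differentiability.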
 
It is a linear transformation. If you choose a basis for R^n and R^m, then you can determine the image of the basis of R^n under the transformation λ. Due to linearity, if we write e_i as the ith basis vector of R^n in some coordinate system and x^i as the ith component of a vector x in R^n, we can then write
$$\lambda(x) = \lambda(x^1e_1 + \cdots + x^ne_n) = x^1\lambda(e_1) + \cdots + x^n\lambda(e_n)$$
The latter sum can be written as a matrix multiplication where λ(e_i) is the ith column of the matrix. However, this is relative to a coordinate system for R^n and R^m. Change those coordinate systems, and the matrix representation of λ will be different. Hence, for theoretical work it is usually preferable to treat λ as a linear transformation and to use a matrix representation only once a coordinate system has been fixed.
Halls gives some examples above of the differing representations of λ in Cartesian coordinate systems with the standard basis.
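As a small illustration of the basis dependence, here is a sketch with a linear map λ: R^2 -> R^2. The matrices A, P, and Q are arbitrary assumptions chosen for the example: A represents λ in the standard bases, the columns of P are a new basis for the domain, and the columns of Q are a new basis for the codomain. The same λ is then represented by Q^{-1} A P in the new coordinates.

```python
def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inv2(M):
    # Inverse of a 2 x 2 matrix (assumes a nonzero determinant).
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 1.0], [0.0, 3.0]]   # lambda in the standard bases
P = [[1.0, 1.0], [0.0, 1.0]]   # columns: new basis of the domain
Q = [[1.0, 0.0], [1.0, 1.0]]   # columns: new basis of the codomain

# Matrix of the same lambda relative to the new bases.
A_new = matmul(inv2(Q), matmul(A, P))

# Consistency check on one vector, given by its coordinates in the new domain basis.
coords = [[1.0], [2.0]]
x_std = matmul(P, coords)                 # the same vector in standard coordinates
image_std = matmul(A, x_std)              # lambda(x) in standard coordinates
image_new = matmul(inv2(Q), image_std)    # lambda(x) in the new codomain basis

print("A     =", A)
print("A_new =", A_new)
print("A_new @ coords =", matmul(A_new, coords), " vs ", image_new)
```

The two matrices differ entry by entry, yet both send a given vector to the same image once coordinates are translated consistently, which is the sense in which λ itself is basis independent.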
 
