Derivative of transformation of basis in R^n


Discussion Overview

The discussion revolves around the derivative of the identity transformation in R^n, particularly focusing on the notation and interpretation of derivatives at specific points. Participants explore the implications of the implicit function theorem and the concept of differentiating transformations, with a specific interest in the identity function and its derivatives.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about the notation D(id_n)_x, questioning the distinction between the identity function and its derivative at a point.
  • Another participant explains that the subscript indicates differentiation at a specific point, relating this to the definition of the derivative as an approximating linear map.
  • There is a discussion about the Jacobian in the context of the inverse function theorem, with one participant emphasizing the importance of a non-zero derivative for the existence of an inverse.
  • Some participants clarify that D(Id)_x equals Id, noting that the derivative of a linear map at any point is the linear map itself.
  • There is a debate about what the identity function looks like in one dimension, with differing views on whether it is simply x or 1.
  • One participant highlights the complexity of multi-dimensional derivatives and the various notations used, expressing frustration over the lack of consistency.

Areas of Agreement / Disagreement

Participants generally agree on the definition of the derivative and its evaluation at a point, but there remains uncertainty and differing interpretations regarding the identity function and its representation in different dimensions. The discussion does not reach a consensus on the exact nature of the identity function in the context of derivatives.

Contextual Notes

Participants note the limitations of notation and the potential for confusion when transitioning from one-dimensional to multi-dimensional derivatives. The discussion reflects the complexities involved in understanding linear operators and their derivatives in higher dimensions.

brydustin
I've just studied the implicit function theorem and if we assume the theorem is true then we can easily compute the following:
id_n = D(id_n)_x = D(f^-1 ° f)_x = D(f^-1)_{f(x)} ° Df_x

where D(*)_a means the derivative of * at a.

Okay... so this was very straightforward until I began to think about D(id_n)_x.
I know that this id_n is not the same as the first id_n. The id_n of D(id_n)_x is the matrix representing the transformation induced by the identity function. i.e. f(x_1,x_2,...x_n) = (x_1,... x_n). But then I began to wonder why it says "at x" (i.e. D(id_n)_x).

In the one dimensional case, f(x) = x. Or another way to look at it f(x) = 1°x. Where ° means composition and is ordinary multiplication, here. I think my problem is that I'm confusing the function with its "action" (by that I mean, I think "f=1" but f(x)=x). This is a very subtle distinction and so I hope that someone that appreciates this can give a not so off hand reply (i.e. are we differentiating the transformation of basis (identity) or the function? what does it look like?)
Anyway, I'm going to assume that in one dimension the first equation reduces to:
1=D(1)_x = ...
But I'm not sure what it means to say D(1)_x

Also could it be that the notes I'm reading are treating "x" as a "point" rather than a variable (and that the notation D(*)_x avoids explicitly stating w.r.t. which variable, other than "all of them/total derivative").
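As a sanity check, the chain-rule identity in the first post can be verified numerically. This is only a sketch: the map f(x, y) = (e^x, x + y) and its inverse are hypothetical examples chosen because both are easy to write down, and `jacobian` is a throwaway finite-difference helper.

```python
import numpy as np

# Finite-difference check of id_n = D(f^-1)_{f(x)} ° Df_x in R^2,
# for the illustrative invertible map f(x, y) = (e^x, x + y),
# whose inverse is f^-1(u, v) = (ln u, v - ln u).

def f(p):
    x, y = p
    return np.array([np.exp(x), x + y])

def f_inv(q):
    u, v = q
    return np.array([np.log(u), v - np.log(u)])

def jacobian(g, p, eps=1e-6):
    """Central-difference Jacobian of g at p."""
    p = np.asarray(p, dtype=float)
    J = np.zeros((g(p).size, p.size))
    for j in range(p.size):
        e = np.zeros(p.size); e[j] = eps
        J[:, j] = (g(p + e) - g(p - e)) / (2 * eps)
    return J

x = np.array([0.3, -1.2])
product = jacobian(f_inv, f(x)) @ jacobian(f, x)
print(np.round(product, 6))   # close to the 2x2 identity matrix
```

Note that the Jacobian of f^-1 is evaluated at f(x), not at x, exactly as the subscript in D(f^-1)_{f(x)} demands.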
 
Hey brydustin.

The intuitive way to show that an inverse function exists (locally, around a region), which is what the inverse function theorem gives you, is to check that the derivative is non-zero there; in the multi-variable case this can be measured by what is called the Jacobian.

The intuitive idea behind this the following:

Lets say we have a one dimensional function y = f(x). If we want an inverse y = f^-1(x) where f(f^-1(x)) = f^-1(f(x)) = x, then we have to have a one-to-one mapping.

If the derivative is zero somewhere, we get an ambiguous situation where we no longer have a 1-1 mapping, and if the derivative changes sign across that zero, then we can't have an inverse on any region containing it, because the function will take some value(s) more than once and so fail to be one-to-one.

The easiest example is cos(x): try to define an inverse first for the restricted domain 0 to pi, and then for 0 to 2pi. In the 0 to 2pi case you hit a major problem, since an inverse would have to take two values for some inputs, and since functions can only take one value, an inverse function does not exist there (but it does for 0 to pi).

For the multi-variable case, we can detect this by calculating the Jacobian, since it represents the local volume change at a particular point, just like dy/dx measures the change in the one-dimensional case. The derivative is a linear object (a matrix), and the Jacobian determinant represents the hypervolume of the image of a unit cube (a parallelepiped) under that map, so it ends up measuring the volume-differential change. (Also keep in mind that derivatives are linear operators, and linear operators on finite-dimensional spaces are just standard matrices.)

In terms of studying inverses, it might help you if you looked at this from the theory of tensors because with tensors you can express going from system to another which means you can look at inverses in a systematic way and tensors really clarify these kinds of things.
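The Jacobian criterion described above can be illustrated numerically. The polar-coordinates map is my own example here, not one from the thread, and `jacobian` is again a throwaway finite-difference helper:

```python
import numpy as np

# f(r, t) = (r cos t, r sin t): its Jacobian determinant is r,
# so f is locally invertible wherever r != 0 and degenerate at r = 0.

def polar(p):
    r, t = p
    return np.array([r * np.cos(t), r * np.sin(t)])

def jacobian(g, p, eps=1e-6):
    """Central-difference Jacobian of g at p."""
    p = np.asarray(p, dtype=float)
    J = np.zeros((g(p).size, p.size))
    for j in range(p.size):
        e = np.zeros(p.size); e[j] = eps
        J[:, j] = (g(p + e) - g(p - e)) / (2 * eps)
    return J

det_away = np.linalg.det(jacobian(polar, [2.0, 0.7]))
det_origin = np.linalg.det(jacobian(polar, [0.0, 0.7]))
print(det_away)    # ~2.0 = r: a local inverse exists here
print(det_origin)  # ~0.0: no local inverse at the origin
```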
 
I'm sorry, I appreciate your response but it had nothing to do with my question; I understand the intuition/proof behind the inverse function theorem. It's just that there was one computation that I don't understand:

id_n = D(id_n)_x = ... where I understand the ... part.

I probably shouldn't have even mentioned the inverse function theorem; anyway, my question is why are they computing D(id_n)_x. That's the derivative of the transformation at x, and THAT'S what doesn't make sense to me, NOT the inv fxn thrm (which does make sense).
 
The subscript means that you are differentiating the function at that point. In the definition of the derivative, we have an approximating linear map for each point ( assuming it is differentiable everywhere ).

In the one dimensional case, our notation is a little different: we usually define the function f'(x), which represents the slope of the tangent line at each point. However, the idea is similar.

Review the multi-dimensional definition of the derivative operator, and you will see that it is a function such that Df(a)(h) is a linear mapping of the variable h. The "a" argument is where you evaluate the derivative.

P.S. D(Id)_x = Id , because the derivative of a linear map at any point is the linear map itself.
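The P.S. above can be checked numerically: the finite-difference Jacobian of the identity map comes out as the identity matrix at every point. A minimal sketch (the helper `jacobian` is illustrative, not from the thread):

```python
import numpy as np

# D(Id)_x = Id: the Jacobian of the identity map is the identity
# matrix, no matter which point x we differentiate at.

def identity(p):
    return np.asarray(p, dtype=float)

def jacobian(g, p, eps=1e-6):
    """Central-difference Jacobian of g at p."""
    p = np.asarray(p, dtype=float)
    J = np.zeros((g(p).size, p.size))
    for j in range(p.size):
        e = np.zeros(p.size); e[j] = eps
        J[:, j] = (g(p + e) - g(p - e)) / (2 * eps)
    return J

for x in [np.zeros(3), np.array([1.0, -2.0, 3.5])]:
    print(np.round(jacobian(identity, x), 6))  # the 3x3 identity, both times
```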
 
wisvuze said:
The subscript means that you are differentiating the function at that point. In the definition of the derivative, we have an approximating linear map for each point ( assuming it is differentiable everywhere ).

Review the multi-dimensional definition of the derivative operator, and you will see that it is a function such that Df(a)(h) is a linear mapping of the variable h. The "a" argument is where you evaluate the derivative.

I very much hate that there are so many different notations for the multi-dimensional derivative; in this case Df(a)(h) = D(f)_a, right (assuming that h is multidimensional (i.e. the total derivative))?

wisvuze said:
P.S. D(Id)_x = Id , because the derivative of a linear map at any point is the linear map itself.

Well, when you say it like that, it's so simple. But what does Id "look" like? I.e. in the one dimensional case, I reckon it's simply x. Is that correct? Or is it 1?
 
brydustin said:
I very much hate that there are so many different notations for the multi-dimensional derivative; in this case Df(a)(h) = D(f)_a, right (assuming that h is multidimensional (i.e. the total derivative))?

Well, when you say it like that, it's so simple. But what does Id "look" like? I.e. in the one dimensional case, I reckon it's simply x. Is that correct? Or is it 1?

Well, it definitely isn't the function "1", but I may be able to see where you are coming from.
As I said before, in the single-variable case the derivative is *not* recorded as the best approximating linear function itself; we only record the "slope" of that function. (This is okay, since every continuous linear function in one variable is of the form h -> c·h, so the slope c determines it with no loss of generality.) So the derivative of f(x) = x is 1, i.e. f'(x) = 1: the slope of the best approximating linear function at every point is just 1. Translating this into the language of "best linear approximations": at any point c, f(c) is best approximated by the tangent line with slope 1, i.e. the map h -> 1·h. This is consistent, because the "total derivative" of a linear function should be the linear function itself, and that is exactly what we get.

Unfortunately, since we are working in n dimensions, it gets harder and harder to visualize what these linear operators look like. However, the Id operator has an obvious algebraic interpretation: it just means the identity (which, more abstractly, is a linear function).

I hope that the above provided a distinction between the "slope" of the linear operator in 1 variable v.s. the linear operator itself.


brydustin said:
I very much hate that there are so many different notations for the multi-dimensional derivative; in this case Df(a)(h) = D(f)_a, right (assuming that h is multidimensional (i.e. the total derivative))?


Yes, Df(a)(h) = D(f)_a, but be mindful that D(f)_a is an operator and not a value. It is linear with respect to h (which is just a vector in R^n, say).
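The distinction drawn above, the slope (a number) versus the derivative as an operator (a linear map), can be sketched in a few lines of Python. The names `f`, `slope`, and `Df_at` are just illustrative:

```python
# In one variable, f(x) = x has slope f'(a) = 1 at every a (a number),
# while the total derivative D(f)_a is the linear map h -> 1*h (an operator).

def f(x):
    return x          # the identity on R

slope = 1.0           # f'(a): the same number for every a

def Df_at(a):
    """D(f)_a as an operator: the best linear approximation h -> slope*h."""
    return lambda h: slope * h

print(f(5.0))            # 5.0 -- the function's value at 5
print(Df_at(5.0)(0.1))   # 0.1 -- the operator applied to an increment h
```

Note that `Df_at(a)` is the same map for every `a`, which is exactly the statement D(Id)_x = Id.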
 
Actually I do see it the way I thought before, but clearly now.
f(x) = x iff f(x) = 1(x).

lim h->0 [1(x+h) - 1(x)]/h = lim h->0 [1(x) + 1(h) - 1(x)]/h = lim h->0 1(h)/h = 1

So actually I can "see" the derivative in the higher dimensions now.

If X is the row vector of variables then
f(X) =X and has the derivative I. We can think of this two ways. Either the direct Jacobian of X which results in I or we can think of f(X) = I(X) and then use the limit argument above. The problem I was having was the lack of an operator present in f(x)=x because we just don't usually write "1". x is just a "dummy variable", in that case it might make more sense to write f(*)= 1(*) to represent y=x, and therefore "close the gap" between what a function identity and algebraic identity is.
 
