# Derivative of transformation of basis in R^n

1. May 6, 2012

### brydustin

I've just studied the implicit function theorem and if we assume the theorem is true then we can easily compute the following:
id_n = D(id_n)_x = D(f^-1 ∘ f)_x = D(f^-1)_f(x) ∘ D(f)_x

where D(*)_a means the derivative of * at the point a.
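As a sanity check on the chain-rule identity above, here is a sketch that verifies D(f^-1)_f(x) ∘ D(f)_x ≈ id_2 numerically for a simple invertible map of my own choosing (f(x, y) = (e^x, x + y), not from the thread), using central finite differences:

```python
import math

def f(p):
    # an invertible map R^2 -> R^2 (example, not from the thread)
    x, y = p
    return (math.exp(x), x + y)

def f_inv(q):
    # its exact inverse: u = e^x, v = x + y  =>  x = ln u, y = v - ln u
    u, v = q
    return (math.log(u), v - math.log(u))

def jacobian(g, p, h=1e-6):
    """Approximate the Jacobian matrix of g at p by central differences."""
    n = len(p)
    cols = []
    for j in range(n):
        plus = list(p); plus[j] += h
        minus = list(p); minus[j] -= h
        gp, gm = g(plus), g(minus)
        cols.append([(gp[i] - gm[i]) / (2 * h) for i in range(len(gp))])
    # transpose the list of columns into a list of rows
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = (0.3, -1.2)
J_f = jacobian(f, x)                 # D(f)_x
J_finv = jacobian(f_inv, f(x))       # D(f^-1)_f(x), evaluated at f(x)
product = matmul(J_finv, J_f)        # should be close to the 2x2 identity
```

The key detail the identity encodes is that the Jacobian of the inverse must be evaluated at f(x), not at x.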

Okay, so this was very straightforward until I began to think about D(id_n)_x.
I know that this id_n is not the same as the first id_n. The id_n of D(id_n)_x is the matrix representing the transformation induced by the identity function, i.e. f(x_1, x_2, ..., x_n) = (x_1, ..., x_n). But then I began to wonder why it says "at x" (i.e. D(id_n)_x).

In the one-dimensional case, f(x) = x. Another way to look at it: f(x) = 1∘x, where ∘ means composition and, here, is ordinary multiplication. I think my problem is that I'm confusing the function with its "action" (by that I mean, I think "f = 1" but f(x) = x). This is a very subtle distinction, so I hope someone who appreciates it can give a not-so-offhand reply (i.e. are we differentiating the transformation of basis (the identity) or the function? What does it look like?)
Anyway, I'm going to assume that in one dimension the first equation reduces to:
1 = D(1)_x = ...
But I'm not sure what it means to say D(1)_x.

Also, could it be that the notes I'm reading are treating "x" as a point rather than a variable (and that the notation D(*)_x avoids explicitly stating with respect to which variable we differentiate, other than "all of them", i.e. the total derivative)?

2. May 6, 2012

### chiro

Hey brydustin.

The intuitive way to think about showing that an inverse function exists (around a region), which is what the inverse function theorem implies, is essentially that the derivative is non-zero, which in the multi-variable case can be measured by what is called the Jacobian.

The intuitive idea behind this is the following:

Let's say we have a one-dimensional function y = f(x). If we want an inverse f^-1 with f(f^-1(x)) = f^-1(f(x)) = x, then we have to have a one-to-one mapping.

If the derivative is zero somewhere, we get an ambiguous situation where we no longer have a 1-1 mapping; and if the derivative changes sign across that zero, then we can't have an inverse once we include the new region, because the function will fail to be one-to-one for some value(s).

The easiest example is cos(x): first consider it from 0 to pi, then from 0 to 2pi, and ask what happens if we try to define an inverse function for each restriction of the domain. In the 0 to 2pi case you run into a major problem: an inverse would have to take two values, and since a function can only take one value, an inverse function does not exist for 0 to 2pi (but it does for 0 to pi).
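The cos(x) example above can be made concrete with a small sketch: count how many solutions cos(x) = y has on each domain by scanning for sign changes of cos(x) - y (a rough numerical method of my own, good enough here since cos is well-behaved):

```python
import math

def preimages(y, lo, hi, steps=20000):
    """Count solutions of cos(x) = y on [lo, hi] via sign changes of cos(x) - y."""
    count = 0
    prev = math.cos(lo) - y
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        cur = math.cos(x) - y
        if prev == 0 or prev * cur < 0:
            count += 1
        prev = cur
    return count

on_half = preimages(0.5, 0.0, math.pi)       # one solution: inverse exists
on_full = preimages(0.5, 0.0, 2 * math.pi)   # two solutions: not one-to-one
```

One preimage on [0, pi] versus two on [0, 2pi] is exactly the "two values" obstruction described above.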

For the multi-variable case, we can detect whether this happens by calculating the Jacobian, since it represents the local volume change (the 'volume differential') at a particular point, just as dy/dx measures the change in the one-dimensional case. The derivative is a linear object, a matrix, and the Jacobian determinant represents the (hyper)volume of the parallelepiped it spans; since we are dealing with derivatives, it ends up measuring the volume-differential change. (Also keep in mind that derivatives are linear operators, and finite-dimensional linear operators are just ordinary matrices.)
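A standard example of the volume-change interpretation is the polar-coordinate map f(r, t) = (r cos t, r sin t), whose Jacobian determinant is exactly r, the familiar area-scaling factor. This sketch (my own example, not from the thread) checks that numerically:

```python
import math

def jac_det(r, t, h=1e-6):
    """2x2 Jacobian determinant of f(r, t) = (r cos t, r sin t) by central differences."""
    def f(r, t):
        return (r * math.cos(t), r * math.sin(t))
    # columns of the Jacobian: partials with respect to r and t
    dfr = [(a - b) / (2 * h) for a, b in zip(f(r + h, t), f(r - h, t))]
    dft = [(a - b) / (2 * h) for a, b in zip(f(r, t + h), f(r, t - h))]
    return dfr[0] * dft[1] - dfr[1] * dft[0]

d = jac_det(2.0, 0.7)   # analytically equal to r = 2.0
```

Wherever r > 0 the determinant is non-zero, so the inverse function theorem gives a local inverse (away from the origin, where the map collapses a whole circle of t values).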

In terms of studying inverses, it might also help to look at this from the theory of tensors, because tensors let you express going from one system to another, which means you can look at inverses in a systematic way; tensors really clarify these kinds of things.

3. May 6, 2012

### brydustin

I'm sorry, I appreciate your response, but it had nothing to do with my question. I understand the intuition/proof behind the inverse function theorem; it's just that there is one computation that I don't understand:

id_n = D(id_n)_x = .... where I understand the ... part.

I probably shouldn't even have mentioned the inverse function theorem. Anyway, my question is why they are computing D(id_n)_x. That's the derivative of the transformation at x, and THAT'S what doesn't make sense to me, NOT the inverse function theorem (which does make sense).

4. May 6, 2012

### wisvuze

The subscript means that you are differentiating the function at that point. In the definition of the derivative, we have an approximating linear map for each point ( assuming it is differentiable everywhere ).

In the one-dimensional case, our notation is a little different: we usually define the function f'(x), which gives the slope of the tangent line at each point. However, the idea is similar.

Review the multi-dimensional definition of the derivative operator, and you will see that Df(a)(h) is a linear mapping in the variable h. The argument "a" is where you evaluate the derivative.

P.S. D(Id)_x = Id , because the derivative of a linear map at any point is the linear map itself.
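The P.S. above can be checked numerically: for any linear map L, the difference quotient [L(x + h·v) - L(x)]/h equals L(v) exactly, regardless of the base point x, so DL_x = L and in particular D(Id)_x = Id. A minimal sketch, using an arbitrary example matrix of my own:

```python
def L(p):
    # an arbitrary linear map R^2 -> R^2, i.e. the matrix [[3, 1], [-1, 2]]
    x, y = p
    return (3 * x + y, -x + 2 * y)

def directional_diff(g, x, v, h=1e-6):
    """Difference quotient [g(x + h*v) - g(x)] / h, approximating Dg_x(v)."""
    gp = g((x[0] + h * v[0], x[1] + h * v[1]))
    gx = g(x)
    return ((gp[0] - gx[0]) / h, (gp[1] - gx[1]) / h)

base = (5.0, -2.0)
v = (1.0, 4.0)
dq = directional_diff(L, base, v)   # matches L(v) whatever base point we pick
```

For a linear map the quotient is constant in h (no limit needed), which is precisely why the derivative of a linear map at any point is the map itself.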

5. May 6, 2012

### brydustin

I very much hate that there are so many different notations for the multi-dimensional derivative. In this case, Df(a)(h) = D(f)_a, right (assuming that h is multi-dimensional, i.e. this is the total derivative)?

Well, when you say it like that, it's so simple. But what does Id "look" like? I.e. in the one-dimensional case, I reckon it's simply x. Is that correct? Or is it 1?

6. May 6, 2012

### wisvuze

Well, it definitely isn't the function "1", but I may be able to see where you are coming from.
As I said before, in the single-variable case the derivative is *not* the best approximating linear function; instead, we only record the "slope" of that best linear function. (This is OK, since it can be shown that every continuous linear function of one variable must be a line through the origin, so there is no loss of generality.) So the derivative of f(x) = x is 1, i.e. f'(x) = 1: the slope of the approximating linear function at every point is just 1.

Translating this into the language of "best linear approximations", we have that at any point c, f(c) is best approximated by the tangent line 1*x (where 1 is the slope), etc. This is fine, because we know that the "total derivative" of a linear function should be the linear function itself, and that is exactly what we get.

Unfortunately, since we are working in n dimensions, it gets harder and harder to imagine what these linear operators look like. However, the Id operator has an obvious algebraic interpretation: it just means the identity (which, more abstractly, is a linear function).

I hope the above clarified the distinction between the "slope" of the linear operator in one variable versus the linear operator itself.

Yes, Df(a) = D(f)_a, but be mindful that D(f)_a is an operator and not a value. It is linear with respect to h (which is just an element of R^n, say).

7. May 6, 2012

### brydustin

Actually, I do see it the way I thought before, but clearly now.
f(x) = x iff f(x) = 1(x).

lim h->0 [1(x+h) - 1(x)]/h = lim h->0 [1(x) + 1(h) - 1(x)]/h = lim h->0 1(h)/h = 1

So actually I can "see" the derivative in the higher dimensions now.

If X is the row vector of variables, then f(X) = X has derivative I. We can think of this in two ways: either compute the Jacobian of X directly, which gives I, or think of f(X) = I(X) and use the limit argument above. The problem I was having was the lack of a visible operator in f(x) = x, because we just don't usually write the "1"; x is just a dummy variable. In that case it might make more sense to write f(*) = 1(*) to represent y = x, and thereby "close the gap" between the function identity and the algebraic identity.
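Both viewpoints can be checked at once with a small sketch: the finite-difference Jacobian of f(X) = X on R^3 comes out as the identity matrix I, which is the limit argument 1(h)/h = 1 carried out coordinate by coordinate:

```python
def identity_map(p):
    # f(X) = X on R^n
    return tuple(p)

def jacobian(g, p, h=1e-6):
    """Approximate the Jacobian of g at p by central differences."""
    n = len(p)
    m = len(g(p))
    J = []
    for i in range(m):
        row = []
        for j in range(n):
            plus = list(p); plus[j] += h
            minus = list(p); minus[j] -= h
            row.append((g(plus)[i] - g(minus)[i]) / (2 * h))
        J.append(row)
    return J

J = jacobian(identity_map, (0.5, -1.0, 2.0))   # close to the 3x3 identity I
```

Since the identity map is linear, the difference quotients are exact (up to floating-point roundoff) at every base point, which is why the "at x" subscript in D(id_n)_x ends up not mattering.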