Inverse function theorem

1. Aug 22, 2006

Castilla

Here is a proof of the inverse function theorem.

1. After the statement and proof of a previous lemma, the author writes (L o f) as a composite function. I don't understand this, because L is a matrix (the Jacobian matrix of f at a), and I have not seen in my book (Apostol's) that one can directly treat a matrix as a function that may be composed with another function.

2. In the proof of Claim 1 the author puts this (j and i are subindexes):

$$|D_j g_i(x)| = |D_j f_i(x) - D_j f_i(a)|$$

and I would be grateful if you could tell me how he introduces the $D_j f_i(a)$ there (because, following the definition of the function g, I thought that $D_j g_i(x) = D_j f_i(x)$).

Attached Files:

• proof of inverse function theorem.pdf
2. Aug 25, 2006

Castilla

Come on, guys, show some pity.

3. Aug 25, 2006

quasar987

1. Here's something that might sound familiar: "Every linear map can be represented as a matrix. Conversely, every matrix defines a linear map."

For example, consider the product of this matrix with the "variable" vector (x,y):

$$\left( \begin{array}{cc} a & b \\ c & d \end{array}\right) \left( \begin{array}{c} x\\ y \end{array}\right)=\left( \begin{array}{c} ax+by \\cx +dy \end{array}\right)$$

So the matrix (a b \ c d) is the matrix representation for the linear map $L:\mathbb{R}^2\rightarrow \mathbb{R}^2$ given by $L(x,y)=(ax+by,cx+dy)$.

At the very beginning of the text, the author also reminds you in a note that the Jacobian matrix of the inverse of f at y is the inverse of the matrix Jf evaluated at f^-1(y), that is, $J(f^{-1})(y)=\left[Jf(f^{-1}(y))\right]^{-1}$. This is the analogue (or rather the generalization to higher dimension) of

$$\frac{d}{dx}f^{-1}(x)=\frac{1}{\left( \frac{df}{dx}\circ f^{-1}\right)(x)}$$

Last edited: Aug 25, 2006
4. Aug 25, 2006

quasar987

This said, can you now see what the matrix representation of $(L\circ f)(x)$ is?

Another note: if M is a linear map, then the notations M(h) and Mh are equivalent. The first refers explicitly to M as a function, while the other treats M as a matrix, but the vector that M(h) and Mh represent is the same, so they are equivalent.

(Edited following shmoe's correction)

Last edited: Aug 25, 2006
5. Aug 25, 2006

shmoe

1. The composition is just matrix multiplication. If you have a matrix L, you are probably used to giving the corresponding linear transformation a different name, like T(x)=L*x, where * is usual matrix multiplication. They just kept the same letter L.

2. They reduced to the case where Jf(a) is the identity matrix, so you know Dj fi (a). Compute Dj gi (x) carefully; it looks different depending on whether j=i or not.
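To spell out the hint in point 2, here is a sketch under an assumption: that the attached proof, after reducing to Jf(a) = I, defines g(x) = f(x) - x. That is the usual choice, but the PDF is not reproduced in this thread, so treat this definition of g as hypothetical. Componentwise,

$$g_i(x) = f_i(x) - x_i \quad\Longrightarrow\quad D_j g_i(x) = D_j f_i(x) - \delta_{ij},$$

where $\delta_{ij}$ is 1 if i=j and 0 otherwise. Since Jf(a) = I says exactly that $D_j f_i(a) = \delta_{ij}$, this can be rewritten as

$$D_j g_i(x) = D_j f_i(x) - D_j f_i(a).$$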

Mh and hM are different in general. Usually you stick to thinking of your linear transformation in terms of either left or right matrix multiplication and wouldn't switch back and forth in the same work.

6. Aug 25, 2006

quasar987

Thx shmoe, I edited my post. I wrote that without thinking because my friend once told me he always computed vector-matrix multiplication with the vector on the left. I then made a mental note that both ways gave the same answer. But what I didn't realise is that if you multiply on the left, it is actually the transpose of the vector that you're multiplying, which, like you said, is something a little different.

7. Aug 25, 2006

Castilla

But is that generalization not the point of the inverse function theorem, which the author still has to prove? He can't state it as a previously known fact.

8. Aug 25, 2006

quasar987

I guess it is, in a sense. But realise that the inverse function thm only gives conditions for the existence of a differentiable inverse.

With or without that thm, we can suppose that the inverse exists and is differentiable, and on that assumption, find its form.

Just like we could say that IF f and g are differentiable, then (f+g)'=f'+g', without first determining whether f and g are actually differentiable.
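A short derivation of that form, as a sketch that assumes (rather than proves) that $f^{-1}$ exists and is differentiable near y, with f mapping $\mathbb{R}^n$ to $\mathbb{R}^n$: differentiate the identity $f(f^{-1}(y)) = y$ with the chain rule to get

$$Jf\left(f^{-1}(y)\right)\, J(f^{-1})(y) = I \quad\Longrightarrow\quad J(f^{-1})(y) = \left[Jf\left(f^{-1}(y)\right)\right]^{-1}.$$

Both factors are n×n matrices, so a one-sided inverse is automatically the inverse.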

Last edited: Aug 25, 2006
9. Aug 25, 2006

mathwonk

i like the proof in spivak, calculus on manifolds.

of course a matrix is a function, namely a linear function, just the kind that one uses to approximate a non linear function in the subject of differential calculus.

the initial composition can be ignored. it is just there to reduce from the case where the derivative is an invertible matrix, to the case where the derivative is the identity matrix.

so just start there. that's the hard part of the proof.

10. Aug 25, 2006

mathwonk

here is a sketch of a rather magical proof.

suppose the derivative is the identity, called say 1, and the function itself is close to the identity, say f = 1 - a, where a is a small function whose derivative is zero at the point in question.

for instance we could take the identity function x, minus the small function x^2; then the function we are trying to invert is x - x^2, with derivative 1 at x = 0.

now how to invert 1-a? well we all know the geometric series that says that 1/(1-a) = 1+a +a^2+a^3+.....,

but this is the inverse of multiplying by 1-a, not a function of form 1-a.

so what? try it anyway. i.e. define the first approximate inverse as 1. that doesn't work of course, since composing 1-a and 1 gives 1-a.

ok, try a second approximate inverse as 1 + a(1+a), where a(1+a) means composition, not multiplication.

then the third guess is 1+ a(1+a(1+a)). and so on,

now we claim that this sequence of functions converges to an inverse of 1-a.

try composing:

i.e. composing (1-a) with (1+a) gives 1+a -a(1+a), and since 1+a is close to 1, the composition a(1+a) is close to a, so 1+a-a(1+a) is close to 1.

now composing (1-a) with 1 + a(1+a), gives:

1 + a(1+a) -a[1 + a(1+a)] which is close to

1 + a(1+a) -a[1 + a] = 1.

well i don't say this is a proof, but the point is that this procedure gives a sequence of approximate inverses that, under some conditions, converges to an actual inverse.
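The procedure above can be tried numerically on the example f(x) = x - x^2. The following is only a sketch: the names `a`, `f` and `approx_inverse` are made up for illustration, and the recursion g_0 = 1, g_{n+1} = 1 + a(g_n) (composition) is one reading of the sequence 1, 1+a, 1+a(1+a), ... described in the post. Evaluated at a point y, it becomes x_0 = y, x_{n+1} = y + a(x_n).

```python
import math


def a(x):
    # The "small" part of the function; a(x) = x^2 has derivative 0 at x = 0.
    return x * x


def f(x):
    # The function to invert near 0, written as identity minus a.
    return x - a(x)


def approx_inverse(y, steps):
    # g_0 = identity, g_{n+1} = identity + a composed with g_n,
    # all evaluated at the single point y.
    x = y
    for _ in range(steps):
        x = y + a(x)
    return x


y = 0.1
for n in (0, 1, 2, 5, 20):
    x_n = approx_inverse(y, n)
    # f(x_n) should approach y as n grows.
    print(n, x_n, f(x_n))

# For comparison: the root of x - x^2 = y that lies near 0.
exact = (1 - math.sqrt(1 - 4 * y)) / 2
print("exact inverse at y:", exact)
```

This only checks convergence for one small value of y; it says nothing about the conditions under which the sequence converges in general.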

11. Aug 28, 2006

Castilla

Shmoe, how can I prove that?

12. Aug 28, 2006

shmoe

There's nothing to prove; that's just how they would have defined the usage of the symbols. You wouldn't have a problem if they took a matrix L and defined a function T(x)=Lx, where x is a vector, then replaced the L's in the proof with T's where appropriate, right? They're just saving on letters and using "L" to denote the transformation "T". Go cross out the L's and turn them into T's; this is really a peripheral point that's not important to this proof, or anything really.

13. Aug 28, 2006

Castilla

Please tell me if I am understanding correctly.

Let L and F be linear transformations such that L: Rn -> Rm and F: Rp -> Rn. Then for every x in Rp there is a vector L(F(x)) belonging to Rm.

Now if L(x) = Lx for some matrix "L", and F(x) = Fx for some matrix "F", then there is a matrix "T" such that L(f(x)) = Tx.

If I understood you, you said that the matrix T is the product of the matrices L and F, in that order. Don't we need a proof for this?

14. Aug 28, 2006

shmoe

What you've said here follows from your definitions and matrix multiplication being associative. (assuming your L(f(x))=Tx was meant to be L(F(x))=Tx)
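For what it's worth, the associativity argument fits on one line (a sketch, writing LF for the matrix product): for every x in Rp,

$$(L\circ F)(x) = L(F(x)) = L(Fx) = (LF)x,$$

so T = LF is the matrix of the composite.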

This isn't what was in your proof though, and isn't what I was talking about. You had some function f, that is not necessarily linear, and a matrix L, and wanted to know what (L o f) was, correct? I'm saying they are treating L as a function by matrix multiplication. (L o f)(x)= L(f(x)) where L(f(x)) is just the matrix L times f(x).
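A small numerical illustration of (L o f)(x) = L(f(x)) may help. This is only a sketch: the matrix entries, the non-linear map `f`, and the helper names `apply_matrix` and `L_after_f` are all hypothetical choices made for the example, not anything taken from the attached proof.

```python
def apply_matrix(M, v):
    # Matrix-vector product M v, with M given as a list of rows.
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def f(v):
    # A non-linear map R^2 -> R^2, chosen only for illustration.
    x, y = v
    return [x * x + y, x * y]


# A 2x2 matrix, treated as the linear map v -> L v.
L = [[1, 2],
     [3, 4]]


def L_after_f(v):
    # (L o f)(v): first apply f, then multiply the result by the matrix L.
    return apply_matrix(L, f(v))


print(f([2, 3]))          # the vector f(2, 3)
print(L_after_f([2, 3]))  # the matrix L times that vector
```

Nothing here requires f to be linear; only L is used as a matrix.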

15. Aug 31, 2006

lokofer

- I have a question... Does every function have an inverse?...

using the identity $$f'[g(x)]g'(x)=1$$ as a differential equation, where f is known, we could get g(x) so that $$(f \circ g)(x)=x$$

16. Aug 31, 2006

quasar987

17. Sep 1, 2006

quasar987

I mean bijective of course..