Understanding the Inverse Function Theorem: A Step-by-Step Guide

Castilla · Aug 22, 2006

Please, can you give me some hints about this?

Here is a proof of the inverse function theorem.

1. After the statement and proof of a preliminary lemma, the author writes (L o f) as a composite function. I don't understand this, because L is a matrix (the Jacobian matrix of f at a), and I have not seen in my book (Apostol's) that one can treat a matrix directly as a function that can be composed with another function.

2. In the proof of Claim 1 the author writes this (j and i are subscripts):

[tex]|D_j g_i(x)| = |D_j f_i(x) - D_j f_i(a)|[/tex]

I would be grateful if you could tell me how he introduces the Dj fi (a) there (because, following the definition of the function g, I thought that Dj gi (x) = Dj fi (x)).

Thanks for your good will and your time.

Castilla · Aug 25, 2006

Come on, guys, show some pity.

quasar987 · Aug 25, 2006

1. Here's something that might sound familiar: "Every linear map can be represented as a matrix. Conversely, every matrix is a linear map."

For example, consider the product of this matrix with the "variable" vector (x,y):

[tex]\left( \begin{array}{cc} a & b \\ c & d \end{array}\right) \left( \begin{array}{c} x\\ y \end{array}\right)=\left( \begin{array}{c} ax+by \\cx +dy \end{array}\right)[/tex]

So the matrix (a b \ c d) is the matrix representation for the linear map [itex]L:\mathbb{R}^2\rightarrow \mathbb{R}^2[/itex] given by [itex]L(x,y)=(ax+by,cx+dy)[/itex].

At the very beginning of the text, the author also reminds you in a note that the Jacobian matrix of the inverse of f at y is the inverse of the matrix Jf evaluated at f^-1(y), that is, (Jf(f^-1(y)))^-1. This is the analogue (or rather the generalization to higher dimensions) of

[tex]\frac{d}{dx}f^{-1}(x)=\frac{1}{\left( \frac{df}{dx}\circ f^{-1}\right)(x)}[/tex]

quasar987 · Aug 25, 2006

That said, can you now see what the matrix representation of [itex](L\circ f)(x)[/itex] is?

Another note: if M is a linear map, then the notations M(h) and Mh are equivalent. The first refers explicitly to M as a function and the second to M as a matrix, but the vector that M(h) and Mh represent is the same, so they are equivalent.

(Edited following shmoe's correction)

shmoe · Aug 25, 2006

1. The composition is just matrix multiplication. If you have a matrix L, you are probably used to giving the corresponding linear transformation a different name, like T(x)=L*x, where * is usual matrix multiplication. They just kept the same letter L.

2. They reduced to the case where Jf(a) is the identity matrix, so you know Dj fi (a). Compute Dj gi (x) carefully; it looks different depending on whether j=i or not.
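To make point 2 concrete, here is a short worked computation. The linked proof is not reproduced in this thread, so the definition g(x) = f(x) - x used below is an assumption (it is the usual choice once Jf(a) has been reduced to the identity); [itex]\delta_{ij}[/itex] denotes the Kronecker delta.

[tex]D_j g_i(x) = D_j f_i(x) - D_j x_i = D_j f_i(x) - \delta_{ij}[/tex]

[tex]Jf(a) = I \;\Longrightarrow\; D_j f_i(a) = \delta_{ij}[/tex]

[tex]|D_j g_i(x)| = |D_j f_i(x) - \delta_{ij}| = |D_j f_i(x) - D_j f_i(a)|[/tex]

Under that assumption, the extra term is 1 when j = i and 0 otherwise, which is exactly the entry of the identity matrix Jf(a).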

quasar987 said:

Another note: if M is a linear map, then the notations M(h) and Mh or hM are equivalent. The first refers explicitly to M as a function, while the other two refer to it as a matrix, but the vector that M(h), Mh and hM represent is the same, so they are all equivalent.

Mh and hM are different in general. Usually you stick to thinking of your linear transformation in terms of either left or right matrix multiplication and wouldn't switch back and forth in the same work.

quasar987 · Aug 25, 2006

Thx shmoe, I edited my post. I wrote that without thinking because my friend once told me he always computed vector-matrix multiplication with the vector on the left. I then made a mental note that both ways gave the same answer. But what I didn't realize is that if you multiply on the left, it is actually the transpose of the vector that you're multiplying, which, like you said, is something a little different.

Castilla · Aug 25, 2006

quasar987 said:

At the very beginning of the text, the author also reminds you in a note that the Jacobian matrix of the inverse of f at y is the inverse of the matrix Jf evaluated at f^-1(y), that is, (Jf(f^-1(y)))^-1. This is the analogue (or rather the generalization to higher dimensions) of

[tex]\frac{d}{dx}f^{-1}(x)=\frac{1}{\left( \frac{df}{dx}\circ f^{-1}\right)(x)}[/tex]

But isn't that generalization the purpose of the inverse function theorem, which the author has yet to prove? He can't state it as a prior fact.

quasar987 · Aug 25, 2006

I guess it is, in a sense. But realize that the inverse function thm only gives conditions for the existence of a differentiable inverse.

With or without that thm, we can suppose that the inverse exists and is differentiable, and on that assumption, find its form.

Just like we could say that IF f and g are differentiable, then (f+g)'=f'+g', without first determining whether f and g are actually differentiable.
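The "assume it exists, then find its form" step can be written out as a short derivation. This is only a sketch under the assumption just stated: f^-1 is taken to exist near y and to be differentiable, and nothing below proves that it does. The only tool used is the chain rule.

[tex]f\left(f^{-1}(y)\right) = y[/tex]

[tex]Jf\left(f^{-1}(y)\right)\, J(f^{-1})(y) = I[/tex]

[tex]J(f^{-1})(y) = \left[ Jf\left(f^{-1}(y)\right) \right]^{-1}[/tex]

The second line is the chain rule applied to both sides of the first. Since both Jacobians are square matrices whose product is the identity, each is invertible, which gives the third line. In one dimension this reduces to the formula for the derivative of an inverse quoted earlier in the thread.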

mathwonk · Aug 25, 2006

i like the proof in Spivak, Calculus on Manifolds.

of course a matrix is a function, namely a linear function, just the kind that one uses to approximate a non-linear function in the subject of differential calculus. the initial composition can be ignored. it is just there to reduce from the case where the derivative is an invertible matrix to the case where the derivative is the identity matrix.

so just start there. that's the hard part of the proof.

mathwonk · Aug 25, 2006

here is a sketch of a rather magical proof.

suppose the derivative is the identity, called say 1, and the function itself is close to the identity, say f = 1 - a, where a is a small function whose derivative is zero at the point in question.

for instance we could have the identity function x, minus the small function x^2. then the function we are trying to invert is x - x^2, with derivative 1 at x = 0.

now how to invert 1-a? well, we all know the geometric series that says that 1/(1-a) = 1 + a + a^2 + a^3 + ...,

but this is the inverse of multiplying by 1-a, not of a function of the form 1-a.

so what? try it anyway. i.e. define the first approximate inverse as 1. that doesn't work of course, since composing 1-a and 1 gives 1-a.

ok, try a second approximate inverse as 1 + a(1+a), where a(1+a) means composition, not multiplication.

then the third guess is 1 + a(1+a(1+a)), and so on.

now we claim that this sequence of functions converges to an inverse of 1-a.

try composing:

i.e. composing (1-a) with (1+a) gives 1+a - a(1+a), and since 1+a is close to 1, the composition a(1+a) is close to a, so 1+a - a(1+a) is close to 1.

now composing (1-a) with 1 + a(1+a), gives:

1 + a(1+a) -a[1 + a(1+a)] which is close to

1 + a(1+a) -a[1 + a] = 1.

well i don't say this is a proof, but the point is that this procedure gives a sequence of approximate inverses that, under some conditions, converge to an actual inverse.
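The procedure above can be tried numerically. The following is a minimal sketch, not a proof: it takes the example f(x) = x - x^2 from the post, so a(x) = x^2, and builds the approximate inverses pointwise by the rule g_0 = identity, g_{n+1}(y) = y + a(g_n(y)), which reproduces 1+a, 1+a(1+a), and so on. The function names and the closed-form `exact_inverse` (valid only for y < 1/4) are illustrative assumptions added for comparison.

```python
import math


def a(x):
    """The small perturbation; here a(x) = x**2, as in the example."""
    return x * x


def f(x):
    """The function to invert, f = 1 - a, i.e. f(x) = x - x**2."""
    return x - a(x)


def approx_inverse(y, steps):
    """Evaluate the n-th approximate inverse at y.

    g_0 is the identity and g_{n+1}(y) = y + a(g_n(y)), so
    g_1 = 1 + a, g_2 = 1 + a(1 + a), and so on (composition, not product).
    """
    g = y
    for _ in range(steps):
        g = y + a(g)
    return g


def exact_inverse(y):
    """Branch of the inverse of x - x**2 through 0 (requires y < 1/4)."""
    return (1.0 - math.sqrt(1.0 - 4.0 * y)) / 2.0


if __name__ == "__main__":
    y = 0.1
    for n in (1, 2, 3, 10, 40):
        g = approx_inverse(y, n)
        # The residual f(g) - y shrinks as n grows when y is small.
        print(n, g, f(g) - y)
    print("exact:", exact_inverse(y))
```

For small y the residual f(g_n(y)) - y shrinks quickly, because near 0 the map g -> y + a(g) is a contraction. For larger y (here, y at or above 1/4) there is no nearby inverse and the iteration is not expected to converge, which is the "under some conditions" caveat.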

Castilla · Aug 28, 2006

shmoe said:

1. The composition is just matrix multiplication.

Shmoe, how can I prove that?

shmoe · Aug 28, 2006

Castilla said:

Shmoe, how can I prove that?

there's nothing to prove, that's just how they would have defined the usage of the symbols. You wouldn't have a problem if they took a matrix L and defined a function T(x)=Lx, where x is a vector, then replaced the L's in the proof with T's where appropriate right? They're just saving up on letters and using "L" to denote the transformation "T". Go cross out the L's and turn them into T's, this is really a peripheral point that's not important to this proof, or anything really.

Castilla · Aug 28, 2006

shmoe said:

there's nothing to prove, that's just how they would have defined the usage of the symbols.

Please tell me if I am understanding well.

Let L and F be linear transformations such that L: Rn -> Rm and F: Rp -> Rn. Then for every x in Rp there is a vector L(F(x)) belonging to Rm.

Now if L(x) = Lx for some matrix "L", and F(x) = Fx for some matrix "F", then there is a matrix "T" such that L(f(x)) = Tx.

If I understood you, you said that the matrix T is the product of the matrices L and F, in that order. Don't we need a proof for this?

shmoe · Aug 28, 2006

Castilla said:

Please tell me if I am understanding well.

Let L and F be linear transformations such that L: Rn -> Rm and F: Rp -> Rn. Then for every x in Rp there is a vector L(F(x)) belonging to Rm.

Now if L(x) = Lx for some matrix "L", and F(x) = Fx for some matrix "F", then there is a matrix "T" such that L(f(x)) = Tx.

If I understood you, you said that the matrix T is the product of the matrices L and F, in that order. Don't we need a proof for this?

What you've said here follows from your definitions and matrix multiplication being associative. (assuming your L(f(x))=Tx was meant to be L(F(x))=Tx)

This isn't what was in your proof though, and isn't what I was talking about. You had some function f, that is not necessarily linear, and a matrix L, and wanted to know what (L o f) was, correct? I'm saying they are treating L as a function by matrix multiplication. (L o f)(x) = L(f(x)), where L(f(x)) is just the matrix L times f(x).
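As a small illustration of treating a matrix as a function, here is a sketch in plain Python. The helper names `matvec` and `compose`, the particular matrix L, and the non-linear map f are all hypothetical choices made for the example; none of them come from the proof under discussion.

```python
def matvec(L, v):
    """Multiply the matrix L (a list of rows) by the column vector v."""
    return [sum(entry * component for entry, component in zip(row, v))
            for row in L]


def f(x):
    """A hypothetical non-linear map R^2 -> R^2, chosen only for illustration."""
    x1, x2 = x
    return [x1 + x2 * x2, x1 * x2]


def compose(L, f):
    """Return the function L o f, i.e. x -> (matrix L) times f(x)."""
    return lambda x: matvec(L, f(x))


L = [[2.0, 0.0],
     [1.0, 3.0]]

L_after_f = compose(L, f)

# f([1, 2]) = [5, 2], and L times [5, 2] = [10, 11].
print(L_after_f([1.0, 2.0]))  # prints [10.0, 11.0]
```

Here L o f is an ordinary function on vectors even though f is not linear; no matrix for f is needed, only the rule "apply f, then multiply by L".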

lokofer · Aug 31, 2006

- I have a question... Does every function have an inverse?

Using the identity [tex] f'[g(x)]g'(x)=1 [/tex] as a differential equation, where f is known, we could get g(x) such that [tex] (f \circ g)(x)=x [/tex]

quasar987 · Aug 31, 2006

There is an inverse to f iff f is surjective.

(Though I am 95% sure of this claim, the wiki article doesn't make it sound so: http://en.wikipedia.org/wiki/Inverse_function_theorem) :grumpy:

quasar987 · Sep 1, 2006

I mean bijective of course..
