Understanding the Inverse Function Theorem: A Step-by-Step Guide

In summary, the thread explains that every linear map can be represented as a matrix and, conversely, every matrix defines a linear map, so that for a linear map M the notations M(h) and Mh denote the same vector.
  • #1
Castilla
Please, can you give me some hints about this?

Here is a proof of the inverse function theorem.

1. After the statement and proof of a previous lemma, the author writes (L o f) as a composite function. I don't understand this because L is a matrix (the Jacobian matrix of f at a) and I have not seen in my book (Apostol's) that one can directly treat a matrix as a function that may be composed with another function.

2. In the proof of Claim 1 the author writes this (j and i are subscripts):

[tex]|D_j g_i(x)| = |D_j f_i(x) - D_j f_i(a)|[/tex]

and I would be grateful if you could tell me how he introduces the D_j f_i(a) there (because, following the definition of the function g, I thought that D_j g_i(x) = D_j f_i(x)).

Thanks for your good will and your time.
 

Attachments

  • proof of inverse function theorem.pdf
  • #2
Come on, guys, show some pity.
 
  • #3
1. Here's something that might sound familiar: "Every linear map can be represented as a matrix. Inversely, every matrix is a linear map."

For example, consider the product of this matrix with the "variable" vector (x,y):

[tex]\left( \begin{array}{cc} a & b \\ c & d \end{array}\right) \left( \begin{array}{c} x\\ y \end{array}\right)=\left( \begin{array}{c} ax+by \\cx +dy \end{array}\right)[/tex]

So the matrix (a b \ c d) is the matrix representation for the linear map [itex]L:\mathbb{R}^2\rightarrow \mathbb{R}^2[/itex] given by [itex]L(x,y)=(ax+by,cx+dy)[/itex].
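As a quick numerical check of this correspondence, here is a minimal sketch in plain Python. The helper name `apply_matrix` and the particular values of a, b, c, d are assumptions made only for illustration; they do not come from the proof.

```python
def apply_matrix(m, v):
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in m]

# Hypothetical entries, chosen only for illustration.
a, b, c, d = 2, 3, 5, 7
M = [[a, b], [c, d]]

def L(x, y):
    """The same linear map written as a function: L(x, y) = (ax + by, cx + dy)."""
    return [a * x + b * y, c * x + d * y]

x, y = 1, -4
# The matrix product and the function value are the same vector.
assert apply_matrix(M, [x, y]) == L(x, y)
```

The matrix and the function are two descriptions of one object, which is why the same letter can be used for both.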

At the very beginning of the text, the author also reminds you in a note that the Jacobian matrix of the inverse of f at y is the inverse of the matrix Jf(f^-1(y)), that is, of Jf evaluated at f^-1(y). This is the analogue (or rather the generalization to higher dimensions) of

[tex]\frac{d}{dx}f^{-1}(x)=\frac{1}{\left( \frac{df}{dx}\circ f^{-1}\right)(x)}[/tex]
 
  • #4
This said, can you now see what the matrix representation of [itex](L\circ f)(x)[/itex] is?

Another note: if M is a linear map, then the notations M(h) and Mh are equivalent. The first refers explicitly to M as a function, while the second treats M as a matrix, but the vector that M(h) and Mh represent is the same, so they are equivalent.

(Edited following shmoe's correction)
 
  • #5
1. The composition is just matrix multiplication. If you have a matrix L, you are probably used to giving the corresponding linear transformation a different name, like T(x)=L*x, where * is usual matrix multiplication. They just kept the same letter L.

2. They reduced to the case that the Jf(a) is the identity matrix, so you know Dj fi (a). Compute Dj gi (x) carefully, it looks different depending on whether j=i or not.
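To spell out the hint in point 2 as a worked equation: this is only a sketch, assuming the proof defines g(x) = f(x) - x (the attached PDF is not reproduced here, so that definition is an assumption) and that Jf(a) is the identity, so that D_j f_i(a) equals the Kronecker delta [itex]\delta_{ij}[/itex]:

[tex]g_i(x) = f_i(x) - x_i \quad\Longrightarrow\quad D_j g_i(x) = D_j f_i(x) - \delta_{ij} = D_j f_i(x) - D_j f_i(a)[/tex]

Under that assumption the extra term comes from differentiating x_i, which contributes 1 when j = i and 0 otherwise.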

quasar987 said:
Another note: if M is a linear map, then the notations M(h) and Mh or hM are equivalent. The first refers explicitly to M as a function, while the other two as a matrix, but the vector that M(h), Mh and hM represent is the same so they are all equivalent.

Mh and hM are different in general. Usually you stick to thinking of your linear transformation in terms of either left or right matrix multiplication and wouldn't switch back and forth in the same work.
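A small sketch in plain Python of why the two products differ in general. The helper names `mat_vec`, `vec_mat` and `transpose`, and the sample matrix, are hypothetical and chosen only for illustration.

```python
def mat_vec(m, h):
    """Column-vector convention M h: each row of M dotted with h."""
    return [sum(m[i][j] * h[j] for j in range(len(h))) for i in range(len(m))]

def vec_mat(h, m):
    """Row-vector convention h M: h dotted with each column of M."""
    return [sum(h[i] * m[i][j] for i in range(len(h))) for j in range(len(m[0]))]

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

# Hypothetical, deliberately non-symmetric matrix.
M = [[1, 2], [3, 4]]
h = [1, 0]

Mh = mat_vec(M, h)  # [1, 3], the first column of M
hM = vec_mat(h, M)  # [1, 2], the first row of M

assert Mh != hM
# Multiplying on the left by h matches multiplying the transpose of M on the right.
assert hM == mat_vec(transpose(M), h)
```

The two results agree only in special cases, for instance when M is symmetric.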
 
  • #6
Thx shmoe, I edited my post. I wrote that without thinking because my friend once told me he always computed vector-matrix multiplication with the vector on the left. I then made a mental note that both ways gave the same answer. But what I didn't realize is that if you multiply on the left, it is actually the transpose of the vector that you're multiplying, which, like you said, is something a little different.
 
  • #7
quasar987 said:
At the very beginning of the text, the author also reminds you in a note that the Jacobian matrix of the inverse of f at y is the inverse of the matrix Jf(f^-1(y)), that is, of Jf evaluated at f^-1(y). This is the analogue (or rather the generalization to higher dimensions) of

[tex]\frac{d}{dx}f^{-1}(x)=\frac{1}{\left( \frac{df}{dx}\circ f^{-1}\right)(x)}[/tex]

But isn't that generalization the very purpose of the inverse function theorem, which the author still has to prove? He can't use it as a previously established fact.
 
  • #8
I guess it is, in a sense. But realize that the inverse function theorem only gives conditions for the existence of a differentiable inverse.

With or without that thm, we can suppose that the inverse exists and is differentiable, and on that assumption, find its form.

Just like we could say that IF f and g are differentiable, then (f+g)'=f'+g', without first determining whether f and g are actually differentiable.
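For what it's worth, the form of the derivative can be sketched from the chain rule alone, under the assumption made in this post that f^-1 exists and is differentiable. Differentiating the identity f(f^-1(y)) = y gives

[tex]Jf\left(f^{-1}(y)\right)\, J(f^{-1})(y) = I \quad\Longrightarrow\quad J(f^{-1})(y) = \left[ Jf\left(f^{-1}(y)\right) \right]^{-1}[/tex]

In one variable this reduces to the reciprocal formula quoted earlier in the thread.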
 
  • #9
i like the proof in spivak, calculus on manifolds.

of course a matrix is a function, namely a linear function, just the kind that one uses to approximate a non linear function in the subject of differential calculus. the initial composition can be ignored. it is just there to reduce from the case where the derivative is an invertible matrix, to the case where the derivative is the identity matrix.

so just start there. that's the hard part of the proof.
 
  • #10
here is a sketch of a rather magical proof.

suppose the derivative is the identity called say 1. and the function itself is close to the identity, say f = 1 - a, where a is a small function whose derivative is zero at the point in question.

for instance we could have the identity function x, minus the small function x^2, then the function we are trying to invert is x - x^2, with derivative 1 at x = 0.


now how to invert 1-a? well we all know the geometric series that says that 1/(1-a) = 1+a +a^2+a^3+...,

but this is the inverse of multiplying by 1-a, not a function of form 1-a.

so what? try it anyway. i.e. define the first approximate inverse as 1+a. that doesn't work exactly of course, since composing 1-a and 1+a gives 1+a-a(1+a) rather than 1.

ok, try a second approximate inverse as 1 + a(1+a), where a(1+a) means composition, not multiplication.


then the third guess is 1+ a(1+a(1+a)). and so on,


now we claim that this sequence of functions converges to an inverse of 1-a.

try composing:

i.e. composing (1-a) with (1+a) gives 1+a -a(1+a), and since 1+a is close to 1, the composition a(1+a) is close to a, so 1+a-a(1+a) is close to 1.


now composing (1-a) with 1 + a(1+a), gives:

1 + a(1+a) -a[1 + a(1+a)] which is close to

1 + a(1+a) -a[1 + a] = 1.

well i don't say this is a proof, but the point is that this procedure gives a sequence of approximate inverses that, under some conditions, converge to an actual inverse.
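The procedure above can be tried numerically. Here is a minimal sketch in plain Python for the example f(x) = x - x^2, so a(x) = x^2. It reads the successive guesses 1+a, 1+a(1+a), ... as the iteration g_{n+1}(y) = y + a(g_n(y)) starting from g_0(y) = y. The function names, the number of steps and the test value y = 0.1 are assumptions chosen for illustration, and the check only makes sense for y close to 0.

```python
def a(x):
    """The 'small' part of f = 1 - a; here a(x) = x**2, as in the example."""
    return x * x

def f(x):
    """The function to invert: f(x) = x - a(x) = x - x**2."""
    return x - a(x)

def approx_inverse(y, steps=50):
    """Iterate g_{n+1}(y) = y + a(g_n(y)), starting from g_0(y) = y.

    Unrolled, the guesses are y + a(y), then y + a(y + a(y)), and so on,
    which matches 1 + a, 1 + a(1 + a), ... in the sketch above.
    """
    g = y
    for _ in range(steps):
        g = y + a(g)
    return g

y = 0.1  # hypothetical test value, chosen close to 0
x = approx_inverse(y)
# If the guesses converge to an inverse, f(x) should reproduce y.
print(x, f(x))
```

For this particular f the inverse near 0 is also known in closed form, x = (1 - sqrt(1 - 4y))/2, which gives an independent check on the limit.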
 
  • #11
shmoe said:
1. The composition is just matrix multiplication.

Shmoe, how can I prove that?
 
  • #12
Castilla said:
Shmoe, how can I prove that?

there's nothing to prove, that's just how they would have defined the usage of the symbols. You wouldn't have a problem if they took a matrix L and defined a function T(x)=Lx, where x is a vector, then replaced the L's in the proof with T's where appropriate, right? They're just saving on letters and using "L" to denote the transformation "T". Go cross out the L's and turn them into T's. This is really a peripheral point that's not important to this proof, or to anything really.
 
  • #13
shmoe said:
there's nothing to prove, that's just how they would have defined the usage of the symbols.

Please tell me if I am understanding well.

Let L and F be linear transformations such that
L: Rn -> Rm and F: Rp -> Rn. Then for every x in Rp there is
a vector L(F(x)) belonging to Rm.

Now if L(x) = Lx for some matrix "L", and F(x) = Fx for some matrix "F", then there is a matrix "T" such that L(f(x)) = Tx.

If I understood you, you said that the matrix T is the product of the matrices L and F, in that order. Don't we need a proof of this?
 
  • #14
Castilla said:
Please tell me if I am understanding well.

Let L and F be linear transformations such that
L: Rn -> Rm and F: Rp -> Rn. Then for every x in Rp there is
a vector L(F(x)) belonging to Rm.

Now if L(x) = Lx for some matrix "L", and F(x) = Fx for some matrix "F", then there is a matrix "T" such that L(f(x)) = Tx.

If I understood you, you said that the matrix T is the product of the matrices L and F, in that order. Don't we need a proof of this?

What you've said here follows from your definitions and matrix multiplication being associative. (assuming your L(f(x))=Tx was meant to be L(F(x))=Tx)

This isn't what was in your proof though, and isn't what I was talking about. You had some function f, that is not necessarily linear, and a matrix L, and wanted to know what (L o f) was, correct? I'm saying they are treating L as a function via matrix multiplication: (L o f)(x) = L(f(x)), where L(f(x)) is just the matrix L times f(x).
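Both situations discussed in this post can be illustrated with a short sketch in plain Python. The helpers `mat_vec` and `mat_mul`, the matrices L and F, and the nonlinear map f are all hypothetical, chosen only for illustration.

```python
def mat_vec(m, v):
    """Matrix (list of rows) times column vector."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in m]

def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]

L = [[1, 2], [0, 1]]  # hypothetical matrix

def f(x):
    """A hypothetical nonlinear map from R^2 to R^2."""
    return [x[0] ** 2, x[0] * x[1]]

def L_after_f(x):
    """(L o f)(x): evaluate f first, then multiply the result by the matrix L."""
    return mat_vec(L, f(x))

# Linear case: applying F and then L agrees with applying the product matrix LF.
F = [[3, 1], [1, 2]]  # hypothetical matrix
x = [2, 5]
assert mat_vec(L, mat_vec(F, x)) == mat_vec(mat_mul(L, F), x)
```

When f is not linear there is no single matrix for L o f, but the composite is still computed the same way: evaluate f(x), then multiply by L.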
 
  • #15
- I have a question... Does every function have an inverse?...

using the identity [tex] f'[g(x)]g'(x)=1 [/tex] as a differential equation, where f is known, we could get g(x) so that [tex] (f \circ g)(x)=x [/tex]
 
  • #17
I mean bijective of course..
 

1. What is the Inverse Function Theorem?

The Inverse Function Theorem states that if a function is continuously differentiable near a point and its derivative at that point is invertible (in one variable, nonzero; in several variables, a Jacobian matrix with nonzero determinant), then the function is locally invertible around that point. This means that there exists a differentiable inverse function defined on a small neighborhood of the image of that point.

2. Why is the Inverse Function Theorem important?

The Inverse Function Theorem is important because it guarantees that a local inverse exists and is differentiable, even when no explicit formula for the inverse is available. It is also a fundamental tool in many areas of mathematics, such as calculus, differential equations, and optimization.

3. How does the Inverse Function Theorem relate to the derivative?

The Inverse Function Theorem is closely related to the derivative because its hypothesis is that the derivative is continuous near the point and invertible at the point. Its conclusion also involves the derivative: the derivative of the inverse function at f(a) is the inverse of the derivative of the original function at a, which in one variable is the reciprocal 1/f'(a).

4. Can the Inverse Function Theorem be applied to any function?

No. The Inverse Function Theorem applies only to functions that are continuously differentiable near the point and whose derivative at that point is invertible. It does not require the function to be one-to-one or onto globally; it only guarantees an inverse on a small neighborhood of the point.

5. How is the Inverse Function Theorem used in practical applications?

The Inverse Function Theorem is used in practice to justify local changes of coordinates and to guarantee that systems of nonlinear equations can be solved near a known solution. It is also the standard route to the implicit function theorem, which appears in constrained optimization and differential equations.
