Common assumption in proof for Inverse function theorem


Discussion Overview

The discussion revolves around the assumptions made in the proof of the Inverse Function Theorem, particularly the assumption that the derivative at a point, Df_a, equals the identity matrix. Participants explore the implications of this assumption and its validity, questioning how it affects the generality of the proof.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • One participant questions why authors assume Df_a = id_n, suggesting it seems like a significant leap in the proof.
  • Another participant proposes that if Df is non-singular at point a, one can define a new function g such that Dg(a) = I, and if g is locally invertible, then f must also be invertible.
  • Some participants argue that the assumption Dg(a) = I does not directly prove Df(a) = id_n, but rather allows for the recovery of the case for Df(a) if proven for Dg(a).
  • A later reply emphasizes that proving the theorem for functions with a derivative equal to the identity can extend to those with an invertible derivative.
  • One participant provides an analogy using real-valued functions to illustrate the logic behind the assumptions made in the proof.
  • Another participant raises a question about the validity of the theorem for compositions of maps involving non-singular linear maps.

Areas of Agreement / Disagreement

Participants express differing views on the validity and implications of the assumption that Df_a = id_n. There is no consensus on whether this assumption is justified or how it affects the proof's generality.

Contextual Notes

Some participants note that the discussion involves assumptions about the nature of derivatives and their invertibility, which may not be universally applicable without additional context or conditions.

Who May Find This Useful

This discussion may be of interest to those studying advanced calculus, differential geometry, or mathematical analysis, particularly in relation to the Inverse Function Theorem and its applications.

brydustin
I don't understand why all the authors of this proof assume that Df_a = id_n. Doesn't this destroy generality?

For example, see https://www.physicsforums.com/showthread.php?t=476508.
The λ in that post (and in the post it quotes) is always Df_a (it's not stated there, but it is in the book and in the quoted post). The question never seems to get answered.

My attempt at an answer:
I FEEL like the assumption is valid because it's only a computation and therefore doesn't change the "structure" of the problem itself (i.e. the spaces are preserved). But at first glance it does seem like a pretty big leap in a proof. The same (or a similar) argument is made in every proof I've seen: Spivak, MIT's OpenCourseWare notes http://ocw.mit.edu/courses/mathematics/18-101-analysis-ii-fall-2005/lecture-notes/lecture7.pdf (i.e. Df(0) = id), and also the proof in Jerry Shurman's online book "Multivariable Calculus". This is a fairly simple question, so could I have a simple answer?
 
Let f map Rn to Rn, and suppose Df is non-singular at the point a. Denote Df(a) by the matrix A, which, again, is nonsingular.

Now let g = A^(-1)f. This is a mapping from Rn to Rn, and Dg(a) = A^(-1)Df(a) = I. Suppose we manage to prove that g is locally invertible, i.e. g^(-1)(y) exists locally near g(a). Then f = Ag is the composition of two invertible mappings, so its inverse exists and equals (Ag)^(-1) = g^(-1)A^(-1).
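As a concrete sanity check of this reduction (the particular map f and point a below are made up purely for illustration), one can verify numerically that rescaling by A^(-1) produces a map whose derivative at a is the identity:

```python
# Illustrative check: for a map f with nonsingular A = Df(a),
# the rescaled map g = A^{-1} f satisfies Dg(a) = I.
import numpy as np

def f(p):
    x, y = p
    return np.array([np.exp(x) + y, x + y**3])

a = np.array([0.5, -0.3])

def jacobian(func, p, h=1e-6):
    """Approximate the Jacobian of func at p by central differences."""
    n = len(p)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (func(p + e) - func(p - e)) / (2 * h)
    return J

A = jacobian(f, a)                        # A = Df(a), nonsingular here
g = lambda p: np.linalg.inv(A) @ f(p)     # g = A^{-1} f

print(np.round(jacobian(g, a), 6))        # approximately the identity matrix
```

Nothing about the proof depends on this particular f; any continuously differentiable map with nonsingular Df(a) behaves the same way.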
 
Vargo said:
Let f map Rn to Rn, and suppose Df is non-singular at the point a. Denote Df(a) by the matrix A, which, again, is nonsingular.

Now let g = A^(-1)f. This is a mapping from Rn to Rn, and Dg(a) = A^(-1)Df(a) = I. Suppose we manage to prove that g is locally invertible, i.e. g^(-1)(y) exists locally near g(a). Then f = Ag is the composition of two invertible mappings, so its inverse exists and equals (Ag)^(-1) = g^(-1)A^(-1).

You assumed exactly what I'm questioning! WHY can we assume that Dg(a) = I? And that wasn't even quite my question; it was: WHY can we assume that Df_a = id_n?

Please try to answer the question that was asked, and not just restate the assumption without explanation.
 
brydustin said:
Please try to answer the question that was asked,and not just restate the assumption without explanation.

Vargo did answer your question. I recommend you reread his/her post.
 
jgens said:
Vargo did answer your question. I recommend you reread his/her post.

No, it doesn't prove that Df(a) = id_n, because it assumes that Dg(a) = id_n.

Then s/he goes on to prove that IF A is invertible (assumed) and g is invertible, then f^(-1) = g^(-1)A^(-1). This DOES NOT prove that A = Id. It merely gives a value for the inverse of f as the composition of invertible functions, unless I'm severely misunderstanding... I don't think I am; perhaps you could elaborate on why it's correct (assuming it is).
 
brydustin said:
No it doesn't prove that Df(a)=id_n because it assumes that Dg(a)=id_n.

You are not trying to show that Df(a) = id. The idea is to show that if we can prove the case when Dg(a) = id, then we can recover the case for Df(a).

Then s/he goes on to prove IF A is invertible (assumed) and g is invertible then f^-1 = g^-1 A^-1.

Exactly!

This DOES NOT prove that A = Id.

As I said earlier, you are not trying to prove this.

It merely gives a value for the inverse of f given the composition of invertible functions

Since f is the composition of one-to-one functions, it follows that f is one-to-one in that neighborhood too. That local invertibility is exactly what the theorem claims, so this is good.

unless I'm severely misunderstanding

You most certainly are misunderstanding. Read the post again. My attempts to help you through this would look exactly like Vargo's.
 
Just to rephrase: we start with f, define g, apply the professor's proof to g, and conclude that f is invertible. Let A = Df(a), which is invertible.

We define g = A^(-1)f.

Then Dg(a) = D(A^(-1)f)(a) = A^(-1)Df(a) = A^(-1)A = I.

Thus, since Dg(a) = I, the instructor's proof says g^(-1) exists locally.

Since f = Ag, it is not hard to show that g^(-1)A^(-1) is the coveted f^(-1) we are searching for.
 
Okay, perhaps a simple example will clear up the logic. Let f(x) be a real-valued function of one real variable (R into R). You want to know whether f(x) is locally invertible near x = a, and you know that f'(a) = m is not zero.

According to your professor/textbook, this is known to be true IF you make the additional assumption that m=1.

Well now, we have the function f(x) whose derivative at a is equal to m, which is not zero, but not necessarily 1 either. Let g(x) = (1/m)f(x). Its derivative at a is equal to (1/m)f'(a) = m/m = 1. So according to the textbook, g(x) is locally invertible with inverse g^(-1). Let's see if we can invert f. We have the equation y = f(x) = mg(x), and we want to solve for x in terms of y. Solving, we see that y/m = g(x), and we know that g is invertible, so x = g^(-1)(y/m). Therefore, f^(-1)(y) exists and equals g^(-1)(y/m).

This is exactly the same device that is used for mappings Rn to Rn, but perhaps, being easier to visualize, it is easier to see how the logic works here.
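To make the 1-d device above fully concrete (the function f and point a here are invented for illustration), take an f with f'(a) = m ≠ 1, rescale to g = (1/m)f, and check that g^(-1)(y/m) really does invert f:

```python
# 1-d illustration: recover f^{-1} from the inverse of the rescaled map g.
import math

f = lambda x: math.exp(2 * x)             # f'(x) = 2 e^{2x}, so m = f'(0) = 2
a, m = 0.0, 2.0

g = lambda x: f(x) / m                    # g'(a) = (1/m) f'(a) = 1
g_inv = lambda y: 0.5 * math.log(m * y)   # explicit local inverse of g

f_inv = lambda y: g_inv(y / m)            # the recovered inverse of f

x = 0.3
print(f_inv(f(x)))                        # ~ 0.3
```

Here f^(-1)(y) = (1/2) ln y, and the recipe g^(-1)(y/m) reproduces it exactly.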
 
I will just add that the inverse function theorem in spirit is saying the following.

For x near a, we want to know whether we can solve the equations y=f(x) for x in terms of y. According to the definition of the derivative:

y= f(a) + Df_a(x-a) + o(|x-a|)

In other words, Df_a gives the closest linear approximation of our mapping in a neighborhood of a. The inverse function theorem says that as long as Df_a is invertible, then locally you can solve the equation for x in terms of y, and you get:

x-a = (Df_a)^(-1)(y-f(a)) + o(|y-f(a)|) .
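One can see this expansion numerically (the map f, the point a, and the hand-computed Jacobian below are made up for illustration): x - a agrees with (Df_a)^(-1)(y - f(a)) up to an error that vanishes faster than |x - a|.

```python
# Numerical check that x - a ~ (Df_a)^{-1}(y - f(a)) near a,
# with the error shrinking faster than the step size eps.
import numpy as np

def f(p):
    x, y = p
    return np.array([x + np.sin(y), y + x**2])

a = np.array([0.2, 0.1])
Df_a = np.array([[1.0, np.cos(a[1])],
                 [2 * a[0], 1.0]])        # Jacobian of f at a, computed by hand
Df_a_inv = np.linalg.inv(Df_a)

for eps in (1e-1, 1e-2, 1e-3):
    x = a + eps * np.array([1.0, -1.0])
    lhs = x - a
    rhs = Df_a_inv @ (f(x) - f(a))
    print(eps, np.linalg.norm(lhs - rhs))  # error is O(eps^2), not O(eps)
```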
 
  • #10
You seem confused by the use of variables. The best statement of the principle behind the proof would have been: if we can prove the theorem for all functions with derivative equal to the identity, then we can also prove it for all functions with invertible derivative. I.e., you are confused by which letter, f or g, is being used to represent the function.
 
  • #11
mathwonk said:
You seem confused by the use of variables. The best statement of the principle behind the proof would have been: if we can prove the theorem for all functions with derivative equal to the identity, then we can also prove it for all functions with invertible derivative. I.e., you are confused by which letter, f or g, is being used to represent the function.
Can you explain further why "if we can prove the theorem for all functions with derivative equal to the identity, then we can also prove it for all functions with invertible derivative"? How does that make sense?
 
  • #12
J.T2015 said:
Can you explain further why "if we can prove the theorem for all functions with derivative equal to the identity, then we can also prove it for all functions with invertible derivative"? How does that make sense?

Sort of like if you have a 1-1 relationship with spiders and sticks, and you find it easier to count sticks, then take any collection of spiders, map to the sticks, count the sticks, then map back to the spiders.

If you have a function f with invertible derivative, there is a g such that g(f(x)) = x, and g'(f(a))f'(a) = id_n. Now prove the statement for g(f(x)); then it will be true for f also. This is the basic idea; I haven't reread the attached proof recently (this thread is from May).
 
  • #13
Suppose the inverse function theorem were true for a composition of maps,

f∘L, where L is a non-singular linear map. Would it be true for f?
 
  • #14
lavinia said:
Suppose the inverse function theorem were true for a composition of maps,

f∘L, where L is a non-singular linear map. Would it be true for f?
Is this a response to the original post or a new question?
 
  • #15
HallsofIvy said:
Is this a response to the original post or a new question?

I thought that if L were equal to the inverse of the Jacobian of f at zero that might give a picture of what is going on.
 
  • #16
Can someone explain why the determinant of the derivative in a neighbourhood of a is also nonzero when the derivative at a is the identity map?
I found a related sentence on Wikipedia: "if the Jacobian determinant at p is positive, then F preserves orientation near p; if it is negative, F reverses orientation." But I don't know the reason for that. Any thoughts?
 
  • #17
J.T2015 said:
Can someone explain why the determinant of the derivative in a neighbourhood of a is also nonzero when the derivative at a is the identity map?
I found a related sentence on Wikipedia: "if the Jacobian determinant at p is positive, then F preserves orientation near p; if it is negative, F reverses orientation." But I don't know the reason for that. Any thoughts?

In the Inverse Function Theorem the function is assumed to be continuously differentiable.
Since the determinant is a polynomial in the entries of the Jacobian, the determinant of the Jacobian is a continuous function of the point.
The determinant of the identity equals 1, so by continuity the determinant stays close to 1, and in particular nonzero, in some neighbourhood of a.
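A small numerical illustration of this continuity argument (the map f and point a below are invented for illustration): with Df(a) = I, the Jacobian determinant at nearby points stays close to 1.

```python
# Continuity of the Jacobian determinant: Df(a) = I implies det Df ~ 1 near a.
import numpy as np

def Df(p):
    """Jacobian of f(x, y) = (x + x*y, y + x^2), computed by hand."""
    x, y = p
    return np.array([[1 + y, x],
                     [2 * x, 1.0]])

a = np.array([0.0, 0.0])                    # Df(a) = I, so det Df(a) = 1
rng = np.random.default_rng(0)
for _ in range(5):
    p = a + 0.05 * rng.standard_normal(2)   # random points near a
    print(np.linalg.det(Df(p)))             # each close to 1, hence nonzero
```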
 
  • #18
J.T2015 said:
Can someone explain why the determinant of the derivative in a neighbourhood of a is also nonzero when the derivative at a is the identity map?
I found a related sentence on Wikipedia: "if the Jacobian determinant at p is positive, then F preserves orientation near p; if it is negative, F reverses orientation." But I don't know the reason for that. Any thoughts?

To get a feel for what is going on, assign an orientation to a line segment in R^2, and see what happens when you apply the map (x,y) --> (-x,y), a linear map of determinant -1, to points in the line (x,0). Try something similar for an oriented rectangle in R^2, or R^3.
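A concrete version of that exercise: the map (x,y) --> (-x,y) has determinant -1, and it flips the sign of the signed area of a (counterclockwise) triangle, which is exactly what "reverses orientation" means in 2-d.

```python
# Orientation reversal under a determinant -1 linear map,
# detected via the sign of the signed area of a triangle.
import numpy as np

def signed_area(tri):
    """Signed area of a triangle given as 3 rows (counterclockwise => positive)."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    return 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

L = np.array([[-1.0, 0.0],
              [0.0, 1.0]])                            # (x, y) -> (-x, y), det L = -1

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # counterclockwise triangle
print(signed_area(tri))                               # 0.5
print(signed_area(tri @ L.T))                         # -0.5: orientation reversed
```

In general, applying a linear map L multiplies signed area by det L, which is why the sign of the Jacobian determinant governs orientation.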
 
