Multivariable Differentiation: McInerney Definition 3.1.1


Discussion Overview

The discussion revolves around the interpretation and application of Definition 3.1.1 from Andrew McInerney's book on multivariable differentiation, specifically focusing on the derivative and linear approximation in the context of a given example involving a function from \(\mathbb{R}\) to \(\mathbb{R}^2\).

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Peter presents an example function \(f: \mathbb{R} \to \mathbb{R}^2\) and seeks assistance in applying Definition 3.1.1 to find \(T_a(h)\).
  • Some participants suggest that there may be a typo in the definition, specifically regarding the placement of terms in the limit expression.
  • One participant proposes that \(T_a(h) = (2h, 3h)\) is a suitable linear transformation, asserting that it simplifies the limit to zero as \(h\) approaches zero.
  • Another participant argues that the function \(T_a(h) = (2, 3)\) is not a linear transformation because it does not map zero to zero, while others counter that it can be interpreted as a linear map when multiplied by \(h\).
  • There is a discussion about the nature of the function \(f\) being affine and how that affects the definition of the linear transformation.
  • Several participants express differing views on whether the original definition is correct or if it contains errors, with some defending the definition as consistent with other mathematical texts.
  • Participants debate the implications of the Jacobian matrix and its relationship to linear maps, with some emphasizing the need for clarity in definitions and notation.

Areas of Agreement / Disagreement

Participants express multiple competing views regarding the correctness of the definition and the appropriate form of \(T_a(h)\). The discussion remains unresolved, with no consensus on whether the definition contains errors or if the proposed transformations are valid.

Contextual Notes

There are unresolved questions regarding the assumptions made in the limit expressions and the definitions of linear versus affine transformations. The discussion highlights the complexity of applying definitions in multivariable calculus and the potential for differing interpretations.

Math Amateur
I am reading Andrew McInerney's book: First Steps in Differential Geometry: Riemannian, Contact, Symplectic ... and I am focused on Chapter 3: Advanced Calculus ... and in particular on Section 3.1: The Derivative and Linear Approximation ...

I am trying to fully understand Definition 3.1.1 and need help with an example based on the definition ...

Definition 3.1.1 reads as follows:
(see the attached image: McInerney - Definition 3.1.1.png)

I constructed the following example ...

Let ##f: \mathbb{R} \to \mathbb{R}^2 ##

such that ##f = ( f^1, f^2 )##

where ##f^1(x) = 2x## and ##f^2(x) = 3x + 1##

We wish to determine ##T_a(h)## ... We have ##f(a + h) = ( f^1(a + h), f^2(a + h) )= (2a + 2h, 3a + 3h +1 )##

and

##f(a ) = ( f^1(a ), f^2(a ) ) = (2a , 3a +1 )##
Now consider

$$\lim_{\|h\| \to 0} \frac{\| f(a + h) - f(a) - T_a(h) \|}{\|h\|}$$
$$= \lim_{\|h\| \to 0} \frac{\| (2a + 2h,\ 3a + 3h + 1) - (2a,\ 3a + 1) - T_a(h) \|}{\|h\|}$$
$$= \lim_{\|h\| \to 0} \frac{\| (2h,\ 3h) - T_a(h) \|}{\|h\|}$$

... but how do I proceed from here?

... can I take ##T_a(h) = T_a \cdot h##? But how do I justify this?

Hope someone can help ...

Peter
 

Attachments

  • McInerney - Definition 3.1.1.png
It has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##\|h\|## below it. What do you think, @fresh_42? Might you have an input here? I think the author goofed. Also, ##T_a(h) = (2, 3)## as ##h \rightarrow 0##; again the author goofed. All this requires is a simple knowledge of calculus to see. I could be wrong, but I think I have this correct here.
 
First, in your example, ##h \in \mathbb{R}## so you can write ##|h|## instead of ##\Vert h \Vert##.

You are looking for a linear transformation ##T_a## such that ##\Vert (2h, 3h) - T_a(h) \Vert/ |h| \to 0## when ##h \to 0##.

In this case it is quite easy: we can simply make the numerator ##0## for all ##h##, so that the limit will be ##0## as well.

Simply pick ##T_a## the linear map defined by ##T_a(h) = (2h, 3h)##.

When you progress further in differentiation theory, you will see that you don't always need the definition to find the linear map. You will show that ##T_a## is the linear map determined by the Jacobian matrix (the one with all the partial derivatives as entries), and that is the case here as well.
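As a quick numerical sanity check (a Python sketch of my own, not part of the thread), one can evaluate the quotient from the definition for Peter's example with the candidate ##T_a(h) = (2h, 3h)## and watch it stay at zero, up to floating-point roundoff:

```python
import math

def f(x):
    # Peter's example: f(x) = (2x, 3x + 1)
    return (2 * x, 3 * x + 1)

def T_a(h):
    # candidate linear transformation T_a(h) = (2h, 3h)
    return (2 * h, 3 * h)

def quotient(a, h):
    # ||f(a+h) - f(a) - T_a(h)|| / |h| from Definition 3.1.1
    fah, fa, t = f(a + h), f(a), T_a(h)
    num = math.hypot(fah[0] - fa[0] - t[0], fah[1] - fa[1] - t[1])
    return num / abs(h)

a = 1.7  # arbitrary base point
for h in (0.1, 0.01, 0.001):
    print(quotient(a, h))  # stays at floating-point roundoff level
```

Any base point ##a## gives the same result, since ##f## is affine.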
 
In this example the linear function is ##T_a(h)=(2h,3h)##. In fact, that will be the case for any ##a## because the function ##f## is affine, which means it is a combination of a linear function with a translation.

Golly I'm having trouble with the new interface. It has taken me about six goes to make MathJax say what I want it to.

PS: the general rule is, for ##n=1##, we have:
$$T_a(h)=h\left(\frac{df^1}{dx}(a),\dots,\frac{df^m}{dx}(a)\right)$$
I am not brave enough to try writing what it is when ##n>1## before I get used to the new MathJax. Too many super- and sub-scripts.
 
Charles Link said:
It has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##\|h\|## below it. What do you think, @fresh_42? Might you have an input here? I think the author goofed.

The definition is perfectly fine. It is the same one that Rudin, Apostol and Spivak use in, respectively, Principles of Mathematical Analysis, Mathematical Analysis, and Calculus on Manifolds.
 
andrewkirk said:
In this example the linear function is ##T_a(h)=(2(h-a),3(h-a))##. In fact, that will be the case for any ##a## because the function ##f## is affine, which means it is a combination of a linear function with a translation.

That function is not a linear transformation, but an affine one. We need ##a = 0##.
 
@Math_QED Please fix your post 3, and reconsider my post 2. :smile:
 
Charles Link said:
@Math_QED Please fix your post 3, and reconsider my post 2. :smile:

I don't understand this post. Are you suggesting that the definition of the author is wrong? I don't think so, to be honest. Can you write out clearly what you think the correct definition of the derivative should be?
 
## Df(a)=\frac{f(a+h)-f(a)}{h}=T_a(h) ## as ## h \rightarrow 0 ##.
 
  • #10
Charles Link said:
It has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##\|h\|## below it. What do you think, @fresh_42? Might you have an input here? I think the author goofed. Also, ##T_a(h) = (2, 3)## as ##h \rightarrow 0##; again the author goofed. All this requires is a simple knowledge of calculus to see. I could be wrong, but I think I have this correct here.

Also, ##T_a(h) = (2,3)## is not a linear transformation: it does not send ##0## to ##0##. It should be ##T_a(h) = (2h,3h)##, as in post 3.

Are you maybe confusing the matrix and the linear transformation? Some authors define the derivative to be the matrix, not the linear transform.
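For contrast, a short sketch (my own illustration, not from the thread) shows that the constant candidate ##T_a(h) = (2, 3)## also fails the defining limit: the quotient blows up rather than tending to ##0##:

```python
import math

def quotient_constant(h):
    # the quotient ||(2h, 3h) - (2, 3)|| / |h| from the thread's example,
    # with the (non-linear) constant candidate T_a(h) = (2, 3)
    return math.hypot(2 * h - 2, 3 * h - 3) / abs(h)

for h in (0.1, 0.01, 0.001):
    print(quotient_constant(h))  # grows without bound instead of tending to 0
```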
 
  • #11
Charles Link said:
## \frac{f(a+h)-f(a)}{h}=T_a(h) ## as ## h \rightarrow 0 ##.
The limit of the LHS as ##h\to 0## is the vector ##(2,3)##, and the limit of the RHS is the vector ##(0,0)##.
If we call the limit of the LHS ##Df(a)## then we have:

$$T_a(h) = h\ Df(a) = h\ (2, 3) = (2h,3h)$$
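This recipe can be checked numerically; the sketch below (my own, with hypothetical helper names) estimates ##Df(a)## for the example by a finite difference and recovers ##(2, 3)##:

```python
def f(x):
    # the example from the thread: f(x) = (2x, 3x + 1)
    return (2 * x, 3 * x + 1)

def Df(a, h=1e-6):
    # finite-difference estimate of lim (f(a+h) - f(a)) / h
    fah, fa = f(a + h), f(a)
    return ((fah[0] - fa[0]) / h, (fah[1] - fa[1]) / h)

# Df(a) is (2, 3) at every a, since f is affine;
# the linear map is then T_a(h) = h * Df(a) = (2h, 3h)
print(Df(0.5))
```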
 
  • #12
@andrewkirk I agree; that fixes everything. The author says ##Df(a) = T_a = T_a(0)## and that is clearly incorrect.
 
  • #13
@Math_QED Rewriting my post 9: ##Df(a) = \frac{f(a+h)-f(a)}{h} = \frac{T_a(h)}{h} = D_a(h)## as ##h \rightarrow 0##, ignoring the normalization lines, or whatever they call them.
 
  • #14
Math_QED said:
Also, ##T_a(h) = (2,3)## is no linear transformation. It does not send ##0## to ##0##. It should be ##T_a(h)=(2h,3h)##, as in post 3.
That's a bit of nitpicking with a bad hand. E.g. you say that the Jacobi matrix is a linear map, but don't allow ##J=(2)## to be called linear? If we say ##(2,3)## is a linear map, then ##h\longmapsto (2,3)\cdot h## is automatically meant. There is no reason to assume ##+(2,3)## (or ##\equiv (2,3)##), which you silently did.
 
  • #15
fresh_42 said:
That's a bit of nitpicking with a bad hand. E.g. you say that the Jacobi matrix is a linear map, but don't allow ##J=(2)## to be called linear? If we say ##(2,3)## is a linear map, then ##h\longmapsto (2,3)\cdot h## is automatically meant. There is no reason to assume ##+(2,3)## (or ##\equiv (2,3)##), which you silently did.

I didn't call the Jacobi matrix a linear map. I said it determines a linear map (by multiplying a vector with it). I am aware that linear maps and matrices can be identified under a vector space isomorphism, but that didn't seem relevant here as this is very elementary and identifications can be confusing.
 
  • #16
Charles Link said:
It has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##\|h\|## below it. What do you think, @fresh_42? Might you have an input here? I think the author goofed. Also, ##T_a(h) = (2, 3)## as ##h \rightarrow 0##; again the author goofed. All this requires is a simple knowledge of calculus to see. I could be wrong, but I think I have this correct here.
But then you would be subtracting a scalar within the norm from the matrix (or vice versa), getting an expression that is not a real number. How do we then get the derivative (at the given point), which is a number?
 
  • #17
WWGD said:
But then you would be subtracting a scalar within the norm from the matrix (or vice versa), getting an expression that is not a real number. How do we then get the derivative (at the given point), which is a number?
My post 13, which is an addendum and minor correction to post 9, answers the question. See also post 11 by @andrewkirk. It is unclear at this time whether the textbook or the OP left off the factor of ##h## in the equation ##h\,D_a(h) = T_a(h)## that made it incorrect in the OP. The problem has been solved, and I am also now in agreement with the posts by @Math_QED. :smile:
 
  • #18
If I may, a warning to the OP about something many, including myself, find confusing: the terms differential and derivative, which is the map, which is the value of the map, etc.
 
  • #19
Thanks to all who contributed thoughts and analysis ...

Altogether most helpful ...

Peter
 
  • #20
WWGD said:
If I may, a warning to the OP about something many, including myself, find confusing: the terms differential and derivative, which is the map, which is the value of the map, etc.
... and a linear map on an algebra, e.g. an algebra of functions, which obeys ##D(fg)=D(f)g+fD(g)##, is called a derivation :oldbiggrin:
 
  • #21
Charles Link said:
@Math_QED Rewriting my post 9: ##Df(a) = \frac{f(a+h)-f(a)}{h} = \frac{T_a(h)}{h} = D_a(h)## as ##h \rightarrow 0##, ignoring the normalization lines, or whatever they call them.

I think what the OP wrote originally is correct; it just amounts to saying that
$$f(a+h) = f(a) + D_a(h) + r(h),$$ where ##D_a(h) ## is a linear function of h (of the form ##c_1 h_1 + c_2 h_2 + c_3 h_3## for a 3-variable function ##f##) and ##r(h)## is of "higher than first order" in ##h##--that is, ##r(h)/\|h \| \to 0## as ##h \to 0.## The term ##r(h)## describes the deviation between the actual function surface (curved) and the tangent-plane (flat).
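To see the remainder term at work on a genuinely curved function (my own example, not from the thread), take ##f(x,y) = x^2 + xy## at ##a = (1,1)##, where the partials are ##f_x = 3## and ##f_y = 1##; the quotient ##|r(h)|/\|h\|## then shrinks in proportion to ##\|h\|##:

```python
import math

def f(x, y):
    # a curved two-variable example: f(x, y) = x^2 + x*y
    return x * x + x * y

def r_over_norm(h1, h2):
    # remainder r(h) = f(a+h) - f(a) - D_a(h) at a = (1, 1),
    # where D_a(h) = 3*h1 + 1*h2 (partials f_x = 2x + y = 3, f_y = x = 1)
    linear = 3 * h1 + 1 * h2
    r = f(1 + h1, 1 + h2) - f(1, 1) - linear
    return abs(r) / math.hypot(h1, h2)

for t in (0.1, 0.01, 0.001):
    print(r_over_norm(t, t))  # shrinks roughly in proportion to t
```

Here the remainder can be computed by hand as ##r(h) = 2t^2## along ##h = (t, t)##, so ##|r(h)|/\|h\| = \sqrt{2}\,t \to 0##.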
 
  • #22
Indeed, the definition given in post #1 is absolutely correct. One big confusion here is between the terms linear and affine. Note that the variable, which could be called ##x##, equals ##a+h##, so ##h = x-a##. Thus a function which is linear in ##h## is affine in ##x##. I.e. when discussing a function's derivative, or differential, at ##a##, we are looking at linear functions on the tangent space at ##a##, which consists of vectors ##h## added to ##a##; we want to consider ##a## itself as the origin of the tangent space at ##a##, so the point ##x## should represent the tangent vector ##h = x-a##.

This confuses almost everyone, certainly me. I once had to break it gently to a full professor colleague, and a highly trained analyst at that, that he had made this error in correcting (incorrectly) the prelim exam of a grad student.

Another possible confusion, less common today, is over whether the derivative is a number (or some numbers) or a linear transformation. Permit me to quote the inimitable Jean Dieudonné, writing in 1960 (Foundations of Modern Analysis), chapter VIII:

"..[Differential] calculus is [here] presented in a way that will probably be new to most students...namely the local approximation of functions by linear functions. In the classical teaching of calculus, this idea is immediately obscured by the accidental fact that, on a one dimensional vector space, there is a one to one correspondence between linear forms and numbers, and therefore the derivative is defined as a number instead of a linear form. This slavish subservience to the shibboleth of numerical interpretation at any cost, becomes much worse when dealing with functions of several variables: one thus arrives at the classical formula giving the partial derivatives of a composite function, which has lost any trace of intuitive meaning, whereas the natural statement of the theorem is of course that the (total) derivative of a composite function is the composite of their derivatives, a very sensible formulation when one thinks in terms of linear approximations."

After reading this, some 50 years ago, I have never got over the impression it made on me, nor forgotten its message. Still almost 60 years later, this message has not reached everywhere, due to its relative abstractness. Hence it would be well if current authors would work harder to make the transition more palatable and clear.
 
  • #23
Indeed I may have added to the confusion by quoting Dieudonné speaking of "approximating functions by linear functions" without being more precise. It has been observed that the function ##f(x)## is being approximated near ##x=a## by the function ##f(a) + T_a(x-a)##, or ##f(a) + T_a(h)##, where ##h = x-a##. Now this function is still affine as a function of ##h##. The point here is that we are approximating, not ##f##, but ##f - f(a)##, by a linear function. Thus indeed ##f## itself is approximated by an affine function. However, when one relocates the coordinate system at the point ##(a, f(a))##, we think of approximating the function ##f(a+h) - f(a)##.

I.e. in old-fashioned terminology, we are considering "changes", and the point is that if ##f(x) = y##, we are claiming that the change in ##y## is approximated by a linear function of the change in ##x##. Loomis makes this very clear on page 141 of his book with Sternberg, Advanced Calculus (available free on Sternberg's page at Harvard: http://www.math.harvard.edu/~shlomo/).

I.e. write ##\Delta_a f(h) = f(a+h) - f(a)##, which is the change in ##y##, or the change in ##f##. Then we want to approximate ##\Delta_a f(h)## by the linear function ##D_af(h)##, Loomis' notation for the linear function ##T_a(h)##. We get that ##\Delta_a f(h) = D_af(h) + o(h)##, where ##o(h)## is an "infinitesimal", i.e. a function of ##h## that approaches zero faster than ##h## does as ##h \to 0##, i.e. such that ##o(h)/|h| \to 0## as ##h \to 0##.

Briefly, at each point ##a##, we approximate ##\Delta f##, the change in ##f##, by a linear function ##Df## of the change in ##x##.
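A concrete illustration of ##\Delta_a f(h) = D_af(h) + o(h)## (a sketch of my own, using ##f = \sin## at an arbitrarily chosen base point): the ratio ##|o(h)|/|h|## goes to zero roughly linearly in ##h##:

```python
import math

a = 0.7  # arbitrary base point, chosen for illustration

def delta_f(h):
    # the "change" Delta_a f(h) = f(a + h) - f(a), with f = sin
    return math.sin(a + h) - math.sin(a)

def D_f(h):
    # the linear approximation D_a f(h) = f'(a) * h = cos(a) * h
    return math.cos(a) * h

for h in (0.1, 0.01, 0.001):
    o = delta_f(h) - D_f(h)  # the infinitesimal o(h)
    print(abs(o) / abs(h))   # tends to 0 as h -> 0
```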
 
