# Multivariable Differentiation .... McInerney Definition 3.1.1 ....

Gold Member
MHB
I am reading Andrew McInerney's book: First Steps in Differential Geometry: Riemannian, Contact, Symplectic ... and I am focused on Chapter 3: Advanced Calculus ... and in particular on Section 3.1: The Derivative and Linear Approximation ...

I am trying to fully understand Definition 3.1.1 and need help with an example based on the definition ...

I constructed the following example ...

Let ##f: \mathbb{R} \to \mathbb{R}^2 ##

such that ##f = ( f^1, f^2 )##

where ##f^1(x) = 2x## and ##f^2(x) = 3x + 1##

We wish to determine ##T_a(h)## ...

We have ##f(a + h) = ( f^1(a + h), f^2(a + h) )= (2a + 2h, 3a + 3h +1 )##

and

##f(a ) = ( f^1(a ), f^2(a ) ) = (2a , 3a +1 )##

Now ... consider ...

##\displaystyle \lim_{ \mid \mid h \mid \mid \to 0} \frac{ \mid \mid f(a + h) - f(a) - T_a(h) \mid \mid }{ \mid \mid h \mid \mid } ##

##\Longrightarrow \displaystyle \lim_{ \mid \mid h \mid \mid \to 0} \frac{ \mid \mid (2a + 2h, 3a + 3h +1) - (2a, 3a + 1) - T_a(h) \mid \mid }{ \mid \mid h \mid \mid }##

##\Longrightarrow \displaystyle \lim_{ \mid \mid h \mid \mid \to 0} \frac{ \mid \mid ( 2h, 3h ) - T_a(h) \mid \mid }{ \mid \mid h \mid \mid }##

... ... but how do I proceed from here ... ?

... can I take ##T_a (h) = T_a.h## ... but how do I justify this?

Hope someone can help ...

Peter

#### Attachments

• McInerney - Definition 3.1.1.png

Homework Helper
Gold Member
The definition has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##||h||## below it. What do you think, @fresh_42? Might you have some input here? I think the author goofed. ##T_a(h)=(2,3)## as ##h \rightarrow 0##. Again the author goofed. All this requires is a simple knowledge of calculus to see the author goofed. I could be wrong, but I think I have this correct here.

First, in your example, ##h \in \mathbb{R}## so you can write ##|h|## instead of ##\Vert h \Vert##.

You are looking for a linear transformation ##T_a## such that ##\Vert (2h, 3h) - T_a(h) \Vert/ |h| \to 0## when ##h \to 0##.

In this case it is quite easy: we can simply make the numerator ##0## for all ##h##, so that the limit is ##0## as well.

Simply pick ##T_a## the linear map defined by ##T_a(h) = (2h, 3h)##.

As you progress in differentiation theory, you will see that you don't always need the definition to find the linear map. You will show that ##T_a## is the linear map determined by the Jacobian matrix (the one with all the partial derivatives as entries), and that is the case here as well.
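The choice ##T_a(h) = (2h, 3h)## can also be checked numerically. What follows is a minimal sketch in Python, not part of the original thread: it evaluates the quotient from the definition for the example ##f(x) = (2x, 3x+1)## and estimates the Jacobian entries with a central difference. The names `f`, `T`, `quotient` and `jacobian_column` are illustrative assumptions.

```python
import math


def f(x):
    """Example from the thread: f(x) = (2x, 3x + 1), a map R -> R^2."""
    return (2.0 * x, 3.0 * x + 1.0)


def T(h):
    """Candidate derivative at every point a: the linear map h -> (2h, 3h)."""
    return (2.0 * h, 3.0 * h)


def quotient(a, h):
    """Return ||f(a + h) - f(a) - T(h)|| / |h|, the quotient in the definition."""
    fa, fah, th = f(a), f(a + h), T(h)
    residual = [fah[i] - fa[i] - th[i] for i in range(2)]
    return math.hypot(residual[0], residual[1]) / abs(h)


def jacobian_column(a, eps=1e-6):
    """Estimate (df^1/dx(a), df^2/dx(a)) with a central difference."""
    plus, minus = f(a + eps), f(a - eps)
    return tuple((plus[i] - minus[i]) / (2.0 * eps) for i in range(2))


if __name__ == "__main__":
    a = 1.5  # arbitrary base point chosen for illustration
    for h in (0.5, 0.25, 2.0 ** -10):
        print(f"h = {h:<12} quotient = {quotient(a, h)}")
    print("estimated Jacobian column:", jacobian_column(a))
```

Because ##f## is affine, the numerator vanishes (up to rounding) for every ##h##, not only in the limit, and the estimated Jacobian entries come out close to ##2## and ##3##.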

Homework Helper
Gold Member
In this example the linear function is ##T_a(h)=(2h,3h)##. In fact, that will be the case for any ##a## because the function ##f## is affine, which means it is a combination of a linear function with a translation.

Golly I'm having trouble with the new interface. It has taken me about six goes to make MathJax say what I want it to.

PS the general rule is, for ##n=1##, we have:
$$T_a(h)=h\left(\frac{df^1}{dx}(a),\dots,\frac{df^m}{dx}(a)\right)$$
I am not brave enough to try writing what it is when ##n>1## before I get used to the new MathJax. Too many super- and sub-scripts.

The definition has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##||h||## below it. What do you think, @fresh_42? Might you have some input here? I think the author goofed.

The definition is perfectly fine. It is the same one that Rudin, Apostol and Spivak use in Principles of Mathematical Analysis, Mathematical Analysis, and Calculus on Manifolds, respectively.

In this example the linear function is ##T_a(h)=(2(h-a),3(h-a))##. In fact, that will be the case for any ##a## because the function ##f## is affine, which means it is a combination of a linear function with a translation.

That function is not a linear transformation, but an affine one. We need ##a = 0##.

Homework Helper
Gold Member

I don't understand this post. Are you suggesting that the definition of the author is wrong? I don't think so to be honest. Can you write clearly out what you think should be the correct definition of derivative then?

Homework Helper
Gold Member
## Df(a)=\frac{f(a+h)-f(a)}{h}=T_a(h) ## as ## h \rightarrow 0 ##.

The definition has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##||h||## below it. What do you think, @fresh_42? Might you have some input here? I think the author goofed. ##T_a(h)=(2,3)## as ##h \rightarrow 0##. Again the author goofed. All this requires is a simple knowledge of calculus to see the author goofed. I could be wrong, but I think I have this correct here.

Also, ##T_a(h) = (2,3)## is not a linear transformation. It does not send ##0## to ##0##. It should be ##T_a(h) = (2h,3h)##, as in post 3.

Are you maybe confusing the matrix and the linear transformation? Some authors define the derivative to be the matrix, not the linear transform.
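To make the distinction concrete, here is a small Python sketch (an illustration added here, with hypothetical helper names) comparing the constant function ##h \mapsto (2,3)## with the map ##h \mapsto (2h,3h)##. It tests additivity on one sample pair and evaluates the quotient from the definition for the thread's example ##f(x) = (2x, 3x+1)##.

```python
import math


def f(x):
    """Example from the thread: f(x) = (2x, 3x + 1)."""
    return (2.0 * x, 3.0 * x + 1.0)


def constant_map(h):
    """h -> (2, 3): the matrix entries read as a constant function of h."""
    return (2.0, 3.0)


def scaling_map(h):
    """h -> (2h, 3h): multiplication of h by the column (2, 3)."""
    return (2.0 * h, 3.0 * h)


def is_additive(T, u, v, tol=1e-12):
    """Check T(u + v) == T(u) + T(v) for one pair of sample inputs."""
    lhs, tu, tv = T(u + v), T(u), T(v)
    return all(abs(lhs[i] - (tu[i] + tv[i])) <= tol for i in range(2))


def quotient(T, a, h):
    """Return ||f(a + h) - f(a) - T(h)|| / |h| for a candidate map T."""
    fa, fah, th = f(a), f(a + h), T(h)
    return math.hypot(fah[0] - fa[0] - th[0], fah[1] - fa[1] - th[1]) / abs(h)


if __name__ == "__main__":
    for name, T in (("constant", constant_map), ("scaling", scaling_map)):
        print(name, "T(0) =", T(0.0), "additive on (1, 2):", is_additive(T, 1.0, 2.0))
        for h in (1e-1, 1e-2, 1e-3):
            print(f"  h = {h:<6} quotient = {quotient(T, 0.0, h)}")
```

The constant function fails the additivity check and its quotient grows as ##h## shrinks, whereas the scaling map passes the check and its quotient stays at rounding-error level.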

Homework Helper
Gold Member
## \frac{f(a+h)-f(a)}{h}=T_a(h) ## as ## h \rightarrow 0 ##.
The limit of the LHS as ##h\to 0## is the vector ##(2,3)##, and the limit of the RHS is the vector ##(0,0)##.
If we call the limit of the LHS ##Df(a)## then we have:

$$T_a(h) = h\ Df(a) = h\ (2, 3) = (2h,3h)$$

Homework Helper
Gold Member
@andrewkirk I agree; that fixes everything. The author says ##Df(a)=T_a=T_a(0)## and that is clearly incorrect.

Homework Helper
Gold Member
@Math_QED Rewriting my post 9: ##Df(a)=\frac{f(a+h)-f(a)}{h}=\frac{T_a(h)}{h}=D_a(h)## as ##h \rightarrow 0##, ignoring the normalization lines, or whatever they call them.

Mentor
2022 Award
Also, ##T_a(h) = (2,3)## is not a linear transformation. It does not send ##0## to ##0##. It should be ##T_a(h)=(2h,3h)##, as in post 3.
That's a bit of nitpicking with a bad hand. E.g., you say that the Jacobi matrix is a linear map, but don't allow ##J=(2)## to be called linear? If we say ##(2,3)## is a linear map, then ##h\longmapsto (2,3)\cdot h## is automatically meant. There is no reason to assume ##+(2,3)## (or ##\equiv (2,3)##), which you silently did.

That's a bit of nitpicking with a bad hand. E.g., you say that the Jacobi matrix is a linear map, but don't allow ##J=(2)## to be called linear? If we say ##(2,3)## is a linear map, then ##h\longmapsto (2,3)\cdot h## is automatically meant. There is no reason to assume ##+(2,3)## (or ##\equiv (2,3)##), which you silently did.

I didn't call the Jacobi matrix a linear map. I said it determines a linear map (by multiplying a vector with it). I am aware that linear maps and matrices can be identified under a vector space isomorphism, but that didn't seem relevant here as this is very elementary and identifications can be confusing.

Gold Member
The definition has what I think is a typo: ##-|T_a(h)|## should be outside the expression and without an ##||h||## below it. What do you think, @fresh_42? Might you have some input here? I think the author goofed. ##T_a(h)=(2,3)## as ##h \rightarrow 0##. Again the author goofed. All this requires is a simple knowledge of calculus to see the author goofed. I could be wrong, but I think I have this correct here.
But then you would be subtracting a scalar within the norm from the matrix (or vice versa), getting an expression that is not a real number. How do we then get the derivative (at the given point), which is a number?

Homework Helper
Gold Member
But then you would be subtracting a scalar within the norm from the matrix (or vice versa), getting an expression that is not a real number. How do we then get the derivative (at the given point), which is a number?
My post 13, which is an addendum and minor correction to post 9, answers the question. See also post 11 by @andrewkirk. It is unclear at this time whether the textbook or the OP left off the factor of ##h## in the equation ##h D_a(h)=T_a(h)##, which made it incorrect in the original post. The problem has been solved, and I am now also in agreement with the posts by @Math_QED.

Gold Member
If I may, to the OP: just a warning about something many, including myself, find confusing: the terms differential and derivative, which one is the map, which one is the value of the map, etc.

Gold Member
MHB
Thanks to all who contributed thoughts and analysis ...

Peter

Mentor
2022 Award
If I may, to the OP: just a warning about something many, including myself, find confusing: the terms differential and derivative, which one is the map, which one is the value of the map, etc.
... and a linear map on an algebra, e.g. a derivative of a function space, which obeys ##D(fg)=D(f)g+fD(g)##, is called a derivation.

Homework Helper
Dearly Missed
@Math_QED Rewriting my post 9: ##Df(a)=\frac{f(a+h)-f(a)}{h}=\frac{T_a(h)}{h}=D_a(h)## as ##h \rightarrow 0##, ignoring the normalization lines, or whatever they call them.

I think what the OP wrote originally is correct; it just amounts to saying that
$$f(a+h) = f(a) + D_a(h) + r(h),$$ where ##D_a(h) ## is a linear function of h (of the form ##c_1 h_1 + c_2 h_2 + c_3 h_3## for a 3-variable function ##f##) and ##r(h)## is of "higher than first order" in ##h##--that is, ##r(h)/\|h \| \to 0## as ##h \to 0.## The term ##r(h)## describes the deviation between the actual function surface (curved) and the tangent-plane (flat).
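For a function that is not affine, the remainder ##r(h)## is no longer zero, but ##r(h)/\|h\|## still tends to ##0##. The sketch below illustrates this in Python under stated assumptions: the 3-variable function ##f(x,y,z) = xy + z^2## and the base point ##a = (1,2,3)## are hypothetical choices made for this example only, and ##D_a(h) = 2h_1 + h_2 + 6h_3## comes from the gradient ##(y, x, 2z)## evaluated at ##a##.

```python
import math

A = (1.0, 2.0, 3.0)  # base point a, chosen only for illustration


def f(x, y, z):
    """Hypothetical nonlinear example: f(x, y, z) = x*y + z^2."""
    return x * y + z * z


def D_a(h):
    """Linear part at A: the gradient (y, x, 2z) at A is (2, 1, 6)."""
    return 2.0 * h[0] + 1.0 * h[1] + 6.0 * h[2]


def remainder_ratio(h):
    """Return |r(h)| / ||h||, where r(h) = f(a + h) - f(a) - D_a(h)."""
    shifted = [A[i] + h[i] for i in range(3)]
    r = f(*shifted) - f(*A) - D_a(h)
    return abs(r) / math.sqrt(sum(c * c for c in h))


if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3):
        h = (t, t, t)  # shrink h along the direction (1, 1, 1)
        print(f"t = {t:<6} |r(h)|/||h|| = {remainder_ratio(h):.6f}")
```

Here ##r(h) = h_1 h_2 + h_3^2## exactly, so along ##h = (t,t,t)## the ratio equals ##2t/\sqrt{3}## and shrinks linearly with ##t##.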

Homework Helper
Indeed, the definition given in post #1 is absolutely correct. One big confusion here is between the terms linear and affine. Note that the variable, which could be called ##x##, equals ##a+h##, so ##h = x-a##. Thus a function which is linear in ##h## is affine in ##x##. I.e., when discussing a function's derivative, or differential, at ##a##, we are looking at linear functions on the tangent space at ##a##, which consists of vectors ##h## added to ##a##. We therefore want to consider ##a## itself as the origin of the tangent space at ##a##, so the point ##x## should represent the tangent vector ##h = x-a##. This confuses almost everyone, certainly me. I once had to break it gently to a full professor colleague, and a highly trained analyst at that, that he had made this error in correcting (incorrectly) the prelim exam of a grad student.

Another possible confusion, less common today, is whether the derivative is a number (or several numbers) or a linear transformation. Permit me to quote the inimitable Jean Dieudonné, writing in 1960 (Foundations of Modern Analysis, chapter VIII):

"..[Differential] calculus is [here] presented in a way that will probably be new to most students...namely the local approximation of functions by linear functions. In the classical teaching of calculus, this idea is immediately obscured by the accidental fact that, on a one dimensional vector space, there is a one to one correspondence between linear forms and numbers, and therefore the derivative is defined as a number instead of a linear form. This slavish subservience to the shibboleth of numerical interpretation at any cost, becomes much worse when dealing with functions of several variables: one thus arrives at the classical formula giving the partial derivatives of a composite function, which has lost any trace of intuitive meaning, whereas the natural statement of the theorem is of course that the (total) derivative of a composite function is the composite of their derivatives, a very sensible formulation when one thinks in terms of linear approximations."

After reading this, some 50 years ago, I have never got over the impression it made on me, nor forgotten its message. Still almost 60 years later, this message has not reached everywhere, due to its relative abstractness. Hence it would be well if current authors would work harder to make the transition more palatable and clear.
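Dieudonné's formulation of the chain rule, "the derivative of a composite function is the composite of their derivatives", can be illustrated with the thread's ##f(x) = (2x, 3x+1)##. The following Python sketch is an added illustration, not something from the book or the thread: the outer function ##g(u,v) = uv## and all helper names are assumptions. It represents each derivative by its matrix and composes them by matrix multiplication.

```python
def f(x):
    """f: R -> R^2 from the thread, f(x) = (2x, 3x + 1)."""
    return (2.0 * x, 3.0 * x + 1.0)


def g(u, v):
    """g: R^2 -> R, a hypothetical outer function g(u, v) = u * v."""
    return u * v


def Df(a):
    """Matrix of the derivative of f at a: the 2x1 column (2, 3)."""
    return [[2.0], [3.0]]


def Dg(p):
    """Matrix of the derivative of g at p = (u, v): the 1x2 row (v, u)."""
    u, v = p
    return [[v, u]]


def matmul(A, B):
    """Plain matrix product, i.e. composition of the two linear maps."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def composite_derivative(a):
    """Dg(f(a)) composed with Df(a), returned as a 1x1 matrix."""
    return matmul(Dg(f(a)), Df(a))


def direct_derivative(a):
    """g(f(x)) = 2x(3x + 1) = 6x^2 + 2x, whose ordinary derivative is 12x + 2."""
    return 12.0 * a + 2.0


if __name__ == "__main__":
    for a in (0.0, 0.3, 1.5):
        print(a, composite_derivative(a)[0][0], direct_derivative(a))
```

Both columns of the printed output should agree up to rounding: composing the linear maps ##Dg(f(a))## and ##Df(a)## reproduces the one-variable derivative of ##g \circ f##.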

Homework Helper
Indeed, I may have added to the confusion by quoting Dieudonné speaking of "approximating functions by linear functions" without being more precise. It has been observed that the function ##f(x)## is being approximated near ##x=a## by the function ##f(a) + T_a(x-a)##, or ##f(a) + T_a(h)##, where ##h = x-a##. Now this function is still affine as a function of ##h##. The point here is that we are approximating not ##f## but ##f - f(a)## by a linear function. Thus indeed ##f## itself is approximated by an affine function. However, when one relocates the coordinate system at the point ##(a, f(a))##, we think of approximating the function ##f(a+h) - f(a)##.

I.e., in old-fashioned terminology, we are considering "changes", and the point is that if ##f(x) = y##, we are claiming that the change in ##y## is approximated by a linear function of the change in ##x##. Loomis makes this very clear on page 141 of his book with Sternberg, Advanced Calculus (available free on Sternberg's page at Harvard: http://www.math.harvard.edu/~shlomo/).

I.e., write ##\Delta_a f(h) = f(a+h) - f(a)##, which is the change in ##y##, or the change in ##f##. Then we want to approximate ##\Delta_a f(h)## by the linear function ##D_a f(h)##, Loomis' notation for the linear function ##T_a(h)##. We get that ##\Delta_a f(h) = D_a f(h) + o(h)##, where ##o(h)## is an "infinitesimal", i.e. a function of ##h## that approaches zero faster than ##h## does as ##h \to 0##, i.e. such that ##o(h)/|h| \to 0## as ##h \to 0##.

Briefly, at each point ##a##, we approximate ##\Delta f##, the change in ##f##, by a linear function ##Df## of the change in ##x##.
