The Differential & Derivative in Multivariable Analysis ....

Math Amateur
I am reading the book "Several Real Variables" by Shmuel Kantorovitz ... ...

I am currently focused on Chapter 2: Derivation ... ...

I need help with an aspect of Kantorovitz's definition of "differential" ...

Kantorovitz's definition of "differential" reads as follows:
Kantorovitz - 1 - Defn 2.1.3 ... Definition of differentiable and differential ... PART 1 ... .png

Kantorovitz - 2 - Defn 2.1.3 ... Definition of differentiable and differential ... PART 2 ... .png
Is the "differential" as defined by Kantorovitz for real-valued functions of several real variables the same as "the derivative"?

If so ... is it the same situation for vector-valued functions of several real variables?

... and ... further to the above ... is the gradient ##\nabla f## the differential/derivative for real-valued functions of several real variables ##f##?

Help will be appreciated ...

Peter
 

There isn't really a concept of 'the derivative' of a function ##f:\mathbb R^k\to \mathbb R##. There are ##k## partial derivatives though, being ##\frac{\partial f}{\partial x_j}## for ##1\leq j\leq k##.

For functions ##f:\mathbb R^k\to \mathbb R^n## the differential is a linear function from ##\mathbb R^k## to ##\mathbb R^n##, whose ##n\times k## matrix is called the Jacobian. The above differential is a special case of a ##1\times k## Jacobian matrix.

The above differential is the same as the gradient. Both are represented by a ##1\times k## matrix (i.e. a row vector) whose ##j##th element is ##\frac{\partial f}{\partial x_j}##.
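As a rough numerical sketch of these shapes (Python with NumPy; the functions `f` and `g` below are illustrative choices of mine, not from the book), a finite-difference Jacobian shows the ##n\times k## matrix in the vector-valued case and the ##1\times k## row (the gradient) in the scalar-valued case:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f: R^k -> R^n at x by central differences."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        h = np.zeros_like(x)
        h[j] = eps
        J[:, j] = (np.asarray(f(x + h)) - np.asarray(f(x - h))) / (2 * eps)
    return J

# f: R^2 -> R^3, so its Jacobian is a 3 x 2 matrix
f = lambda x: np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

# g: R^2 -> R, so its Jacobian is the 1 x 2 row vector (the gradient)
g = lambda x: np.array([x[0] ** 2 + 3 * x[1]])

x0 = np.array([1.0, 2.0])
print(numerical_jacobian(f, x0).shape)  # (3, 2)
print(numerical_jacobian(g, x0))        # approximately [[2. 3.]]
```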

Note however that in the 3D vector calculus often used in engineering, the gradient is a vector, rather than a linear functional. Since your book sounds like pure maths, a long way from engineering, that is unlikely to be a point of confusion.

A linear functional is also variously known as a 'dual vector', 'co-vector', 'one-form', 'linear form' or ##\pmatrix{ 0\\1}## tensor (or maybe ##\pmatrix{ 1\\0}## tensor. I can never remember which way round it goes. The other one is just an ordinary vector). It eats vectors and spits out real numbers.
 
Thanks Andrew ... that was most helpful ...

Peter
 
Note that the operation of taking the dot product with a vector is a linear functional. Sometimes one uses the term "gradient" for the vector such that dotting with it gives the differential. I.e. if grad(f) is the gradient vector, then the operation ##\langle \operatorname{grad} f, \cdot \rangle##, thought of as a linear function of its second argument, is the differential.
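A quick numerical check of this relation (Python; the function `f` and point `a` below are hypothetical choices of mine): dotting the gradient vector with a direction ##h## reproduces the directional difference quotient of ##f## along ##h##.

```python
import numpy as np

# Hypothetical smooth f: R^3 -> R and its gradient, computed by hand
f = lambda x: x[0] ** 2 * x[1] + np.exp(x[2])
grad_f = lambda x: np.array([2 * x[0] * x[1], x[0] ** 2, np.exp(x[2])])

a = np.array([1.0, 2.0, 0.5])
h = np.array([0.3, -0.1, 0.2])

# The differential applied to h, via the dot product with the gradient vector
df_h = np.dot(grad_f(a), h)

# Directional difference quotient (f(a + t h) - f(a)) / t for small t
t = 1e-6
quotient = (f(a + t * h) - f(a)) / t
print(df_h, quotient)  # the two agree to roughly six decimal places
```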

Moreover, although the term "differential" is common (e.g. in the book of Loomis and Sternberg), there are older books (Lang, Analysis I and II; also Dieudonné, Foundations of Modern Analysis) wherein the word "derivative" (or "total derivative") is used in place of "differential". In some places, what is now called the differential is called the "Fréchet derivative", especially in the context of infinite dimensions.
https://en.wikipedia.org/wiki/Fréchet_derivative

Terminology is always somewhat variable, both in time and from book to book.

Technically, the thing to remember is that the property of having a total derivative, or differential, is stronger than that of having partial derivatives. I.e. in finite dimensions, if a function has a differential at a point, then it also has partial derivatives at that point, and the matrix representing the differential has those partial derivatives as entries. However, even if the partial derivatives exist at a point, the matrix they form may not define a linear function with the defining property of the differential above. I.e. they do define a linear function, but that linear function may not give a sufficiently close approximation to the original function to be called the differential.

Fortunately, if the partial derivatives exist not just at the given point but in a whole neighborhood of that point, and if those partials are continuous at the given point, then the linear function they represent does define the differential there; in particular the function is then differentiable there in the total or Fréchet sense. This is the most useful criterion for deciding that a function is differentiable, rather than the definition, which states a property that can be quite hard to check.
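The standard counterexample (my addition, not from Kantorovitz) is ##f(x,y) = xy/(x^2+y^2)## with ##f(0,0) = 0##: both partials exist at the origin, yet ##f## is not even continuous there, so no differential can exist. A short numerical check:

```python
def f(x, y):
    # The classic counterexample: both partials exist at the origin,
    # but f is not continuous there, hence not differentiable.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x ** 2 + y ** 2)

t = 1e-8
fx0 = (f(t, 0.0) - f(0.0, 0.0)) / t  # partial w.r.t. x at the origin: 0.0
fy0 = (f(0.0, t) - f(0.0, 0.0)) / t  # partial w.r.t. y at the origin: 0.0

# Along the diagonal x = y the function is identically 1/2, so it does
# not tend to f(0, 0) = 0 and no linear approximation can work there.
print(fx0, fy0, f(t, t))  # 0.0 0.0 0.5
```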
 
Thanks mathwonk ... most helpful!

Indeed ... I have just done some checking of texts other than Kantorovitz and found exactly what you have said ... and some of the books using the term 'total derivative' are (moderately) recent, compared to Lang and Dieudonné anyway. In particular, in the book "Multidimensional Real Analysis I: Differentiation" by J. J. Duistermaat and J. A. C. Kolk (Cambridge University Press, 2004), the definition of differentiability for vector-valued multivariable functions includes the term (total) derivative as follows:
D&K - 1 - Defn 2.2.2 ... ... PART 1 ... .png

D&K - 2 - Defn 2.2.2 ... ... PART 2 ... .png
 

The thing to remember is that the linear map is intrinsically determined by the original map, and has intrinsic meaning for that map, as the best possible linear approximation to it, locally at the given point. The partial derivatives, on the other hand, do not have intrinsic meaning except with respect to a particular basis for the source vector space. I.e. they are the coefficients of the derivative linear map with respect to a given basis. A linear map is determined by its values on the elements of a basis, and the partial derivatives are those values. The partial derivatives are directional derivatives, in the given basis directions. Change the basis and you change the partial derivatives, i.e. the coefficients representing the linear map, but you do not change the linear map itself.
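A small numerical sketch of this basis-dependence point (Python; the function and the rotation below are illustrative choices of mine): under an orthogonal change of basis the coefficient list, i.e. the gradient's entries, changes, but the value of the linear map on any fixed vector does not.

```python
import numpy as np

f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]               # illustrative f: R^2 -> R
grad = lambda x: np.array([2 * x[0] + 3 * x[1], 3 * x[0]])

# An orthogonal change of basis: rotation by 30 degrees
th = np.pi / 6
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])

x0 = np.array([1.0, 2.0])
v = np.array([0.5, -1.0])

# The same function expressed in the rotated coordinates u = R x
F = lambda u: f(R.T @ u)

# Gradient of F at u0 = R x0, by central differences
eps = 1e-6
grad_F = np.array([(F(R @ x0 + eps * e) - F(R @ x0 - eps * e)) / (2 * eps)
                   for e in np.eye(2)])

print(grad(x0), grad_F)                # different coefficient lists
print(grad(x0) @ v, grad_F @ (R @ v))  # same directional derivative
```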

On the other hand, it is hard to describe the linear map explicitly without giving a basis and coefficients in terms of that basis. It is also hard to prove the derivative exists without choosing a basis and calculating the partials in that basis. So the derivative has the intrinsic meaning, but it is calculated in terms of the partial derivatives. The two concepts go hand in hand and complement each other.

In infinite dimensions, where bases are more difficult to come by and matrices are not available, partials are not as useful and derivatives are harder to work with. E.g. if we want to discuss curves of minimal length, we want to consider an infinite dimensional space of paths and take the derivative of the length function, but this is quite challenging, and I believe it provides the first problem in the calculus of variations, solved by Euler. Indeed a little research in vol. 2, ch. VII, of Courant's calculus indicates that a variety of problems of this nature, involving minimizing an integral formula taken over a variable curve, are reduced to the solution of "Euler's differential equation", which I presume is merely the result of computing a certain infinite dimensional derivative and setting it equal to zero.

Note that in infinite dimensions, since partials are not available (indeed complete infinite dimensional vector spaces are always uncountably infinite dimensional), one is forced to use the linear map definition of the derivative. But it is then challenging to make computations with it.

Already in finite dimensions, however, the linear map definition of the derivative is very useful, especially in simplifying the chain rule. Here the complicated formulas expressing the partial derivatives of a composite map in terms of the partial derivatives of the components are replaced by the simple statement that the derivative of a composition is merely the composition of the derivatives. The formulas in terms of partials are then easily recovered from the rules for matrix multiplication, i.e. composition of linear maps.
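This version of the chain rule can be checked numerically (Python; the maps `f` and `g` below are illustrative choices of mine): the finite-difference Jacobian of ##g\circ f## agrees with the matrix product of the two Jacobians.

```python
import numpy as np

def jac(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x."""
    fx = np.asarray(f(x))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        h = np.zeros_like(x)
        h[j] = eps
        J[:, j] = (np.asarray(f(x + h)) - np.asarray(f(x - h))) / (2 * eps)
    return J

f = lambda x: np.array([x[0] * x[1], x[0] + x[1] ** 2])    # R^2 -> R^2
g = lambda y: np.array([np.sin(y[0]), y[0] * y[1], y[1]])  # R^2 -> R^3

x0 = np.array([1.0, 2.0])
lhs = jac(lambda x: g(f(x)), x0)   # 3 x 2 Jacobian of the composition
rhs = jac(g, f(x0)) @ jac(f, x0)   # (3 x 2) times (2 x 2) matrix product
print(np.allclose(lhs, rhs, atol=1e-4))  # True
```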
 
Thanks mathwonk for some more really helpful insights ...

You don't tend to get these insights via the usual textbooks ... well ... I don't anyway ...

Thanks again ...

Peter
 
you are very welcome. it is a pleasure to share insights gained after over 50 years contemplating these ideas. dieudonne is an especially good book, but not so easy to read. no matter, it repays slow reading. also courant. easier to read and also very good are books by mike spivak.
 
Thanks mathwonk ...

Your posts are valuable ...

May look into getting a copy of Jean Dieudonne's analysis book ...

Peter
 